Categorización de texto sensible al coste para el filtrado de contenidos inapropiados en Internet (Cost-sensitive text categorization for filtering inappropriate Internet content)

  1. Puertas Sanz, Enrique
  2. Carrero García, Francisco
  3. Buenaga Rodríguez, Manuel de
  4. Gómez Hidalgo, José María
Journal: Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2003

Issue: 31

Pages: 13-20

Type: Article


Abstract

Access to inappropriate Internet content is a growing problem that can be approached as a cost-sensitive Automated Text Categorization task. In this paper, we report a series of experiments that compare a representative range of learning algorithms, and methods for making them cost-sensitive, on two collections of Web pages, one in Spanish and one in English. The results of our experiments are promising.