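Sistema de acceso a la información basado en conceptos utilizando Freebase en Español-Inglés sobre el dominio médico y turístico [Concept-based information access system using Freebase in Spanish-English for the medical and tourism domains]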

  1. Muñoz Gil, Rafael
  2. Aparicio Galisteo, Fernando
  3. Buenaga Rodríguez, Manuel de
Journal:
Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2012

Issue: 49

Pages: 29-38

Type: Article

Abstract

In this paper we present a semantics-based information access tool focused on both medical and tourism texts. Using tagging techniques for recognized entities, the system extracts relevant concepts and provides more information about them by drawing on collaborative databases and ontologies. Components particularly relevant to its development are Freebase, a large collaborative knowledge base, and formal resources such as MedlinePlus and PubMed. The platform architecture has been designed with scalability in mind, so that it can serve as a broad platform for information integration, with the following objectives: to allow the integration of different natural language processing techniques, to expand the sources from which information can be extracted, and to ease the integration of new user interfaces.
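
The flow described in the abstract (recognize entities, map them to concepts, then enrich them from pluggable sources) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `GAZETTEER`, the `Concept` type, and the `register_source` and `enrich` helpers are hypothetical names, and the sources are stubs rather than calls to Freebase, MedlinePlus, or PubMed.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Concept:
    term: str    # surface form found in the text
    domain: str  # "medical" or "tourism"


# Hypothetical toy gazetteer standing in for a real knowledge base lookup.
GAZETTEER: Dict[str, str] = {"asthma": "medical", "alhambra": "tourism"}

# A source maps a concept to a short description (stubbed in this sketch).
Source = Callable[[Concept], str]
SOURCES: Dict[str, List[Source]] = {}


def register_source(domain: str, source: Source) -> None:
    """Attach a new information source to a domain without touching other code."""
    SOURCES.setdefault(domain, []).append(source)


def recognize(text: str) -> List[Concept]:
    """Tag tokens that appear in the gazetteer as concepts."""
    found = []
    for token in re.findall(r"\w+", text.lower()):
        domain = GAZETTEER.get(token)
        if domain is not None:
            found.append(Concept(token, domain))
    return found


def enrich(text: str) -> Dict[str, List[str]]:
    """Return, for each recognized concept, what every registered source says."""
    return {
        c.term: [source(c) for source in SOURCES.get(c.domain, [])]
        for c in recognize(text)
    }


# Stub sources; a real deployment would query external services here.
register_source("medical", lambda c: f"stub health-topic entry for {c.term}")
register_source("tourism", lambda c: f"stub travel entry for {c.term}")

result = enrich("Asthma patients visiting the Alhambra")
```

Keeping sources behind a registry is one way to reflect the stated goal of expanding information sources independently of the recognition step; the actual system may be organized differently.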

Bibliographic References

  • Allan, J., B. Carterette y J. Lewis. 2005. When will information retrieval be “good enough”? Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval.
  • Aparicio, F., M. De Buenaga, M. Rubio, y A. Hernando. 2011a. An Intelligent Information Access system assisting a Case Based Learning methodology evaluated in higher education with medical students. Computers & Education.
  • Aparicio, F., R. Muñoz, M. Buenaga, y E. Puertas. 2011b. MDFaces: An intelligent system to recognize significant terms in texts from different domains using Freebase. En Procesamiento de Lenguaje Natural, 47, pp. 317-318.
  • Bollacker, K., C. Evans, P. Paritosh, T. Sturge y J. Taylor. 2008. Freebase: a collaboratively created graph database for structuring human knowledge. In SIGMOD '08: Proceedings of the 2008 ACM SIGMOD international conference on Management of data held in Vancouver, Canada, 1247-1250. ACM.
  • Brenes, D. J., D. G. Avello y K. P. González. 2009. Survey and evaluation of query intent detection methods. In WSCD '09: Proceedings of the 2009 workshop on Web Search Click Data held in Barcelona, Spain, 1-7. ACM.
  • Carrero, F., J. M. Gómez., M. de Buenaga, J. Mata, y M. Maña. 2007. Acceso a la información bilingüe utilizando ontologías específicas del dominio biomédico. Procesamiento de Lenguaje Natural, Vol. 38, pp. 107-117.
  • Chang, C., H. Kayed, M. Girgis y R. Shaalan. 2006. A Survey of Web Information Extraction Systems. IEEE Transactions on knowledge and data engineering. Volume: 18, Issue: 10
  • Clark, D. B., S. Touchman, M. Martinez-Garza, F. Ramirez-Marin, y T. Skjerping Drews. 2012. Bilingual language supports in online science inquiry environments. Computers & Education, 58(4), pp. 1207-1224.
  • Clifford, G., D. J. Scott y M. Villarroel. 2010. User Guide and Documentation for the MIMIC II Database, Rev: 259. Cambridge, MA, USA.
  • Cunningham H. et al, 2002. GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications. Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics, Philadelphia.
  • De Melo, G., y G. Weikum. 2010. MENTA: inducing multilingual taxonomies from wikipedia. En Proceedings of the 19th ACM international conference on Information and knowledge management, CIKM’10, pp. 1099–1108, New York (USA)
  • Egozi, O., S. Markovitch y E. Gabrilovich. 2011. Concept-Based Information Retrieval Using Explicit Semantic Analysis. ACM Transactions on Information Systems. Vol. 29, Issue 2.
  • Gutiérrez, Y., A. Fernández, A. Montoyo y S. Vázquez. 2010. Integración de recursos semánticos basados en WordNet. Sociedad Española para el Procesamiento del Lenguaje Natural. Revista 45.
  • Hamburg, M. A. y F. S. Collins. 2010. The Path to Personalized Medicine. New England Journal of Medicine, 363, pp. 301-304.
  • Han, X. y J. Zhao. 2009. CASIANED: Web Personal Name Disambiguation Based on Professional Categorization. En Proceedings of 2nd Web People Search Evaluation Workshop (WePS2), Madrid, Spain.
  • Knoth, P., T. Collins, E. Sklavounou, y Z. Zdrahal. 2010. Facilitating cross-language retrieval and machine translation by multilingual domain ontologies. En Workshop on Supporting eLearning with Language Resources and Semantic Data (at LREC 2010), Valletta (Malta).
  • Lew, M. S., N. Sebe, C. Djeraba y R. Jain. 2006. Content-based multimedia information retrieval: state of the art and challenges. ACM Transactions on Multimedia Computing, Communications and Applications. Vol. 2, Issue 1.
  • Luo, G. y C. Tang. 2008. On iterative intelligent medical search. In SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval held in Singapore, Singapore, 3-10. ACM.
  • Mayfield, J., D. Lawrie, P. McNamee, y D. W. Oard. 2011. Building a Cross-Language Entity Linking Collection in Twenty-One Languages. En P. Forner, J. Gonzalo, J. Kekäläinen, M. Lalmas, & M. Rijke (Eds.), Multilingual and Multimodal Information Access Evaluation, Vol. 6941, pp. 3-13, Berlin (Heidelberg).
  • Müller, C., y I. Gurevych. 2009. Using Wikipedia and Wiktionary in domain-specific information retrieval. En Proceedings of the 9th Cross-language evaluation forum conference on Evaluating systems for multilingual and multimodal information access, CLEF’08, pp. 219–226, Berlin (Heidelberg)
  • Muñoz, R., F. Aparicio, M. De Buenaga. 2011. Tourist Face: A contents system based on concepts of Freebase for access to the cultural-tourist information. NLDB.
  • Nadeau, D. y Sekine S., 2007. A survey of named entity recognition and classification. En Linguisticae Investigationes, Vol. 30, pp. 3–26.
  • Navigli, R. y P. Velardi. 2004. Learning Domain Ontologies from Document Warehouses and Dedicated Web Sites.
  • Tuchinda, R., C. A. Knoblock y P. Szekely. 2011. Building Mashups by Demonstration. ACM Transactions on the Web (TWEB).
  • Voss, A., K. Nakata y M. Juhnke. 1999. Concept indexing. Proceedings of the International ACM SIGGROUP conference on Supporting group work.
  • Verdejo, F., J. Gonzalo, D. Fernández, A. Peñas y F. López. 2000. ITEM: un motor de búsqueda multilingüe basado en indexación semántica. Proceedings JBIDI.