Extensions

InfoChem offers additional modules for ICANNOTATOR, enabling extraction of different entities in multiple languages.

LANGUAGE EXTENSIONS

We offer two language extension packs: the first supports German, French and Russian texts, and the second supports Chinese, Japanese and Korean texts. Like the core module, these modules combine an algorithmic approach with a dictionary approach. Native speakers have checked the quality of these language packs, which achieved F-scores between 0.8 and 0.9.
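
For readers unfamiliar with the metric, the F-score is the harmonic mean of precision and recall. The sketch below assumes the balanced F1 variant, since the text does not say which variant was used. The function name and the counts are hypothetical and serve only as illustration; they are not InfoChem evaluation data.

```python
def f_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Balanced F1 score from entity-level counts (assumed variant)."""
    if true_positives == 0:
        # No correct extractions: precision and recall are both zero.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 85 correct entities, 15 spurious, 15 missed.
example = f_score(true_positives=85, false_positives=15, false_negatives=15)
print(round(example, 2))
```

With these made-up counts, precision and recall are both 0.85, so the score falls inside the 0.8 to 0.9 range quoted above.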

MATERIALS EXTENSION

With this extension pack, you can extract inorganics, metal organics (formulas and names) and polymers. It is often not possible to derive a well-defined structure for these substances. In such cases we offer alternative output formats, e.g. the monomer structures for polymers, and molecular formulas or element systems for inorganic substances.

BIOLOGICALS EXTENSION

This module extracts gene, protein and disease names. It uses large dictionaries sourced mainly from GenBank, UniProt and MeSH. For ambiguous protein names, we use a machine learning approach to detect false positives.
