Apache cTAKES

Developer(s)	Apache Software Foundation

Stable release	5.1.0 / May 16, 2024

Repository	cTakes Repository
Written in	Java, Scala, Python
Operating system	Cross-platform
Type	Natural language processing, Bioinformatics, Text mining, Information Extraction
License	Apache License 2.0
Website	Official website

Apache cTAKES (clinical Text Analysis and Knowledge Extraction System) is an open-source natural language processing (NLP) system that extracts clinical information from the unstructured text of electronic health records. It processes clinical notes and identifies several types of clinical named entities: drugs, diseases/disorders, signs/symptoms, anatomical sites and procedures. Each named entity has attributes for the text span, the ontology mapping code, the context (family history of, current, unrelated to patient), and whether it is negated.^[1]
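The entity attributes described above can be pictured as a small record type. The following is a minimal Python sketch for illustration only: the class names, field names and the placeholder ontology code are assumptions made here, and do not reproduce the actual cTAKES type system.

```python
from dataclasses import dataclass
from enum import Enum


class EntityType(Enum):
    """Entity types named in the article."""
    DRUG = "drug"
    DISEASE_DISORDER = "disease/disorder"
    SIGN_SYMPTOM = "sign/symptom"
    ANATOMICAL_SITE = "anatomical site"
    PROCEDURE = "procedure"


class Context(Enum):
    """Context values named in the article."""
    CURRENT = "current"
    FAMILY_HISTORY = "family history of"
    UNRELATED_TO_PATIENT = "unrelated to patient"


@dataclass(frozen=True)
class ClinicalEntity:
    """Hypothetical record for one extracted named entity."""
    begin: int              # start offset of the text span
    end: int                # end offset of the text span (exclusive)
    entity_type: EntityType
    ontology_code: str      # ontology mapping code
    context: Context
    negated: bool

    def covered_text(self, note: str) -> str:
        """Return the part of the note that this entity spans."""
        return note[self.begin:self.end]


note = "Patient denies chest pain."
start = note.index("chest pain")
entity = ClinicalEntity(
    begin=start,
    end=start + len("chest pain"),
    entity_type=EntityType.SIGN_SYMPTOM,
    ontology_code="PLACEHOLDER-CODE",  # placeholder, not a real ontology code
    context=Context.CURRENT,
    negated=True,
)
print(entity.covered_text(note), entity.negated)
```

Storing character offsets rather than a copy of the text keeps each entity tied to its position in the original note, which is why the covered text is recomputed from the note on demand.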

cTAKES was built using the Unstructured Information Management Architecture (UIMA) framework and the OpenNLP natural language processing toolkit.^[2]^[3]

Components

Components of cTAKES are specifically trained for the clinical domain, and create rich linguistic and semantic annotations that can be utilized by clinical decision support systems and clinical research.^[4]

These components include:

Named Section identifier
Sentence boundary detector
Rule-based tokenizer
Formatted list identifier
Normalizer
Context dependent tokenizer
Part-of-speech tagger
Phrasal chunker
Dictionary lookup annotator
Context annotator
Negation detector
Uncertainty detector
Subject detector
Dependency parser
Patient smoking status identifier
Drug mention annotator
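
Components such as those listed above are typically chained so that each one adds annotations that later ones can use. The following Python sketch illustrates that chaining idea with four toy stand-ins (sentence detection, tokenization, dictionary lookup and negation detection). It is not the cTAKES implementation: the function names, the two-entry dictionary and the cue-word negation rule are simplifications assumed here for illustration.

```python
import re
from typing import Callable, Dict, List

Document = Dict[str, object]
Annotator = Callable[[Document], Document]

# Toy resources, invented for this sketch.
TOY_DICTIONARY = {"aspirin": "drug", "fever": "sign/symptom"}
NEGATION_CUES = {"no", "denies", "without"}


def sentence_detector(doc: Document) -> Document:
    """Naively split the text after sentence-final punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", doc["text"])
    doc["sentences"] = [p.strip() for p in parts if p.strip()]
    return doc


def tokenizer(doc: Document) -> Document:
    """Lower-case each sentence and split it into word tokens."""
    doc["tokens"] = [re.findall(r"\w+", s.lower()) for s in doc["sentences"]]
    return doc


def dictionary_lookup(doc: Document) -> Document:
    """Mark every token that appears in the toy dictionary."""
    mentions: List[dict] = []
    for idx, tokens in enumerate(doc["tokens"]):
        for tok in tokens:
            if tok in TOY_DICTIONARY:
                mentions.append({"sentence": idx, "text": tok,
                                 "type": TOY_DICTIONARY[tok], "negated": False})
    doc["mentions"] = mentions
    return doc


def negation_detector(doc: Document) -> Document:
    """Flag a mention as negated if a cue word precedes it in its sentence."""
    for mention in doc["mentions"]:
        tokens = doc["tokens"][mention["sentence"]]
        position = tokens.index(mention["text"])
        mention["negated"] = any(t in NEGATION_CUES for t in tokens[:position])
    return doc


def run_pipeline(text: str, annotators: List[Annotator]) -> Document:
    """Pass one document through each annotator in order."""
    doc: Document = {"text": text}
    for annotate in annotators:
        doc = annotate(doc)
    return doc


PIPELINE = [sentence_detector, tokenizer, dictionary_lookup, negation_detector]
result = run_pipeline("Patient denies fever. Started aspirin today.", PIPELINE)
for mention in result["mentions"]:
    print(mention["text"], mention["type"], "negated" if mention["negated"] else "affirmed")
```

The order of the list matters in this sketch: the negation step reads the tokens and mentions produced by the earlier steps, so it has to run last.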

History

Development of cTAKES began at the Mayo Clinic in 2006. The development team, led by Dr. Guergana Savova and Dr. Christopher Chute, included physicians, computer scientists and software engineers. After its deployment, cTAKES became an integral part of Mayo's clinical data management infrastructure, processing more than 80 million clinical notes.^[5]

When Dr. Savova moved to Boston Children's Hospital in early 2010, the core development team grew to include members there, and further external collaborations followed.^[5]

Such collaborations have extended cTAKES' capabilities into other areas such as Temporal Reasoning, Clinical Question Answering, and coreference resolution for the clinical domain.^[5]

In 2010, cTAKES was adopted by the i2b2 program and is a central component of the SHARP Area 4.^[5]

In 2013, the project published cTAKES 3.0, its first release as an Apache Software Foundation incubator project.^[citation needed]

In March 2013, cTAKES became an Apache Software Foundation Top Level Project (TLP).^[5]

References
