Pathology Informatics

Pathology Informatics Terminology Reference

April 25 2020


The following terms related to the field of pathology informatics were compiled by the ACVP Pathology Informatics Education Committee.



Annotation

Indication of the position and/or outline of structures or objects within digital images, usually produced by humans using a computer mouse or drawing tablet. Annotations may have associated labels and possibly other meta-data. Annotations can be generated manually or by algorithmic tools (Abels et al. 2019).


Artificial intelligence (AI)

A branch of computer science dealing with the simulation of intelligent behavior in computers (Abels et al. 2019).



Attribute

A feature/variable/relationship that can be used to describe some aspect of an instance.


Bayesian networks

A probabilistic graphical model of a topic/problem that can use Bayesian calculations to infer probabilistic states of the nodes within the model.
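As a rough illustration of the inference a Bayesian network performs, the sketch below uses a hypothetical two-node network (Disease -> TestResult) and applies Bayes' rule to update the probability of disease after a positive test. The node names and all probability values are invented for illustration only.

```python
# Hypothetical two-node network: Disease -> TestResult.
# All probabilities are made-up illustrative values, not clinical data.
p_disease = 0.10            # prior P(Disease = present)
p_pos_given_disease = 0.90  # P(Test = positive | Disease = present)
p_pos_given_healthy = 0.20  # P(Test = positive | Disease = absent)


def posterior_given_positive(prior, p_pos_d, p_pos_h):
    """Return P(Disease = present | Test = positive) using Bayes' rule."""
    # Total probability of observing a positive test.
    evidence = p_pos_d * prior + p_pos_h * (1.0 - prior)
    return p_pos_d * prior / evidence


posterior = posterior_given_positive(
    p_disease, p_pos_given_disease, p_pos_given_healthy
)
print(round(posterior, 3))  # 0.333
```

With these assumed numbers, a positive test raises the probability of disease from 10% to about 33%. Larger networks chain the same kind of calculation across many nodes.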


Cloud computing

The practice of using a network of remote servers hosted on the internet to store, manage, and process data, rather than a local server or a personal computer (Abels et al. 2019).


Computational pathology (CPATH)

Discipline at the intersection of digital image analysis, biomarkers, proteomics, genomics, outcome measures, etc. (“big data”), which requires algorithms or statistical modeling techniques to process/analyze for scientific insights (e.g., patterns, relationships). A branch of pathology that involves computational analysis of a broad array of methods to analyze patient specimens for the study of disease. Extraction of information from digitized pathology images in combination with their associated meta-data, typically using AI methods such as deep learning (Abels et al. 2019).


Convolutional neural network (CNN)

A type of deep neural network particularly designed for images. It uses a kernel or filter to convolve an image, which results in features useful for differentiating images (Abels et al. 2019).  
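To make the idea of convolving an image with a kernel concrete, here is a minimal pure-Python sketch. The tiny 4x4 "image" and the 2x2 vertical-edge kernel are invented for illustration. As in most deep learning libraries, the kernel is slid over the image without being flipped. In a real CNN the kernel values are learned during training rather than written by hand.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum of the image patch under the kernel.
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Illustrative image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written kernel that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The resulting feature map is nonzero only in the column where the edge lies, which is the kind of feature a classifier can then use to differentiate images.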


Decision support system

A computer program that helps guide the user in medical decision making by applying relevant patient data to a variety of algorithm types (rule based, logic based, probabilistic, hierarchical, relational, etc.). Also called computer-aided or computer-assisted diagnosis.


Decision tree

A learning algorithm that constructs decision branching points from the dataset; these branching points sort each case instance into an outcome or classification category.


Deep learning (DL)

A subset of machine learning composed of algorithms that permit software to train itself to perform tasks by exposing multilayered artificial neural networks to vast amounts of data. Data are fed into the input layers and are sequentially processed in a hierarchical manner with increasing complexity at each layer, modeled loosely after the hierarchical organization in the brain. Optimization functions are iteratively trained to shape the processing functions of the layers and the connections between them (Abels et al. 2019).



DICOM

DICOM® (Digital Imaging and Communications in Medicine) is the international standard to transmit, store, retrieve, print, process, and display medical imaging information.


Digital pathology

A blanket term that encompasses tools and systems to digitize pathology slides and associated meta-data, their storage, review, analysis, and enabling infrastructure (Abels et al. 2019).



Discretization

The conversion of continuous data into discrete values by “binning” it, as required by some machine learning algorithms.
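A minimal sketch of binning is shown below. The cut points (10.0 and 20.0), the bin labels, and the sample measurements are arbitrary assumptions chosen only to show the mechanics.

```python
from bisect import bisect_right

# Assumed cut points and labels; real bins depend on the data and task.
edges = [10.0, 20.0]
labels = ["low", "medium", "high"]


def discretize(value, edges, labels):
    """Map a continuous value to the label of the bin it falls into."""
    # bisect_right counts how many cut points are <= value.
    return labels[bisect_right(edges, value)]


values = [3.2, 10.0, 15.5, 27.1]
binned = [discretize(v, edges, labels) for v in values]
print(binned)  # ['low', 'medium', 'medium', 'high']
```

Note that a value exactly on a cut point (10.0 here) is placed in the upper bin, a convention that should be stated whenever bins are reported.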


Entity-attribute-value triples

Used to describe related pieces of data, as might be expressed in a relational database, in terms of some “entity”, an “attribute”, and the attribute’s “value”.
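The following sketch shows one way such triples might be stored and regrouped into one record per entity. The slide identifiers, attribute names, and values are hypothetical.

```python
from collections import defaultdict

# Each row is one (entity, attribute, value) triple; all values are made up.
triples = [
    ("slide_001", "species", "canine"),
    ("slide_001", "stain", "H&E"),
    ("slide_002", "species", "feline"),
]


def group_by_entity(rows):
    """Collect the attribute/value pairs belonging to each entity."""
    records = defaultdict(dict)
    for entity, attribute, value in rows:
        records[entity][attribute] = value
    return dict(records)


records = group_by_entity(triples)
print(records["slide_001"])  # {'species': 'canine', 'stain': 'H&E'}
```

Because every fact is its own row, a new attribute can be recorded without changing the table layout, at the cost of an extra regrouping step when a full record is needed.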



A single piece of data or some instance.


False positive rate

False positive rate is the number of false positives (FP) divided by the total number of negatives, that is, false positives plus true negatives (TN).

False positive rate = [FP/(FP+TN)]x100
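A small worked example of the false positive rate formula follows. The counts (5 false positives, 95 true negatives) are invented for illustration.

```python
def false_positive_rate(fp, tn):
    """Return the false positive rate as a percentage: [FP/(FP+TN)] x 100."""
    negatives = fp + tn
    if negatives == 0:
        raise ValueError("at least one negative case is required")
    return 100.0 * fp / negatives


# Assumed counts: 100 truly negative cases, 5 of them flagged as positive.
fpr = false_positive_rate(fp=5, tn=95)
print(fpr)  # 5.0
```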


Gold standard

The practical standard that is used to capture ‘ground truth’. The gold standard may not always be perfectly correct, but in general is viewed as the best approximation (Abels et al. 2019).


Ground truth

A category, quantity, or label assigned to a dataset that provides guidance to an algorithm during training. Depending on the task, the ground truth can be a patient- or slide-level characterization or can be applied to objects or regions within the image. The ground truth is an abstract concept of the ‘truth’ (Abels et al. 2019). 


Image analysis

Application of analytical tools and algorithms to digitized (microscopic) images to characterize features and produce quantitative output. A method to extract typically quantifiable information from images (Abels et al. 2019).



Informatics

The interdisciplinary study of how data is acquired, stored, retrieved, analyzed, and presented in such a way as to turn data into information to improve health outcomes.



Instance

Usually one event or item in a dataset.


Machine learning (ML)

A branch of AI in which computer software learns to perform a task by being exposed to representative data. Includes algorithms that allow computers to identify and “learn” patterns from large amounts of data, without being explicitly programmed/directed to look for specific features. Applications include image analysis and computational pathology (Abels et al. 2019).



Meta-data

In the context of digital pathology, the term meta-data refers to descriptive data associated with the individual, sample, or slide. Meta-data include image acquisition information, patient demographic data, pathologist annotations or classifications, and outcome data from treatment. Typically, meta-data are entries that allow searches in databases. Highly complex, large, multiple-time-point data, such as longitudinal image data (e.g., radiology) or genomic data, are not usually classified as “meta-data”.



Node

Typically a term or concept within a graphical model where some decision or probabilistic state can be determined.



Ontology

A set of concepts and categories in a subject area or domain that shows their properties and the relations between them.


Pathology Informatics

Pathology informatics is conveniently defined as the study and management of pathology information, information systems, and process (or workflows) (Lee et al. 2013).



Precision

An information retrieval analytic defined as {Precision = [TP/(TP+FP)]x100}, where TP is the number of true positives and FP is the number of false positives.



Recall

An information retrieval analytic that is the same as the true positive rate in the medical domain {Recall = [TP/(TP+FN)]x100}, where FN is the number of false negatives.
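The precision and recall formulas can be compared side by side on one set of counts. The counts below (40 true positives, 10 false positives, 20 false negatives) are invented for illustration.

```python
def precision(tp, fp):
    """Percentage of predicted positives that are truly positive."""
    return 100.0 * tp / (tp + fp)


def recall(tp, fn):
    """Percentage of truly positive cases that were detected."""
    return 100.0 * tp / (tp + fn)


# Assumed counts from a hypothetical classifier evaluation.
tp, fp, fn = 40, 10, 20
p = precision(tp, fp)
r = recall(tp, fn)
print(round(p, 1), round(r, 1))  # 80.0 66.7
```

The two measures share a numerator but differ in the denominator: precision is penalized by false positives, recall by false negatives.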


Relational database

A method of electronically storing related pieces of data as rows within various tables of the database, which helps minimize storage space requirements and facilitates efficient retrieval and reassembly of the data without reordering of the tables.
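As a sketch of how related rows in separate tables are reassembled, the example below uses Python's built-in sqlite3 module with an in-memory database. The table names (animal, slide), columns, and rows are hypothetical.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Species is stored once per animal instead of once per slide.
conn.execute("CREATE TABLE animal (animal_id INTEGER PRIMARY KEY, species TEXT)")
conn.execute(
    "CREATE TABLE slide ("
    "slide_id INTEGER PRIMARY KEY, "
    "animal_id INTEGER REFERENCES animal(animal_id), "
    "stain TEXT)"
)
conn.executemany("INSERT INTO animal VALUES (?, ?)", [(1, "canine"), (2, "feline")])
conn.executemany(
    "INSERT INTO slide VALUES (?, ?, ?)",
    [(10, 1, "H&E"), (11, 1, "IHC"), (12, 2, "H&E")],
)

# A join reassembles the related rows at query time.
rows = conn.execute(
    "SELECT slide.slide_id, animal.species, slide.stain "
    "FROM slide JOIN animal ON slide.animal_id = animal.animal_id "
    "ORDER BY slide.slide_id"
).fetchall()
conn.close()

print(rows)  # [(10, 'canine', 'H&E'), (11, 'canine', 'IHC'), (12, 'feline', 'H&E')]
```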


Relational network

Similar to a semantic network, but relationships are typically causal in nature.


Semantic network

Representation of knowledge in a domain expressed by a collection of relationships used to define its terminology.


Supervised machine learning

Supervised machine learning is used to train a model to predict an outcome or to classify a dataset based on the labels associated with its data points (i.e., ground truth). An example of supervised machine learning is the design of classifiers to distinguish between benign and malignant regions based on manual annotations (Abels et al. 2019).



Terminology

A defined nomenclature for a field of study.


Training set

A collection of data that is associated with defined outcomes, used for training supervised machine learning algorithms.


True positive rate

True positive rate is true positives (TP) divided by the total number of positives.

True positive rate = [TP/(TP+FN)]x100


Unsupervised machine learning

Unsupervised machine learning seeks to identify natural divisions in a dataset without the need for ground truth, often using methods such as cluster analysis or pattern matching. Examples of unsupervised machine learning include identification of images with similar attributes or the clustering of tumors into subtypes (Abels et al. 2019).
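Cluster analysis can be sketched with a very small one-dimensional k-means routine. This is only one of many clustering methods, and the measurements and starting centers below are invented for illustration. No labels (ground truth) are supplied; the groups emerge from the values alone.

```python
def kmeans_1d(values, centers, iterations=10):
    """Basic k-means on a list of numbers, starting from the given centers."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        centers = [
            sum(c) / len(c) if c else centers[k]
            for k, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical measurements (e.g., a per-image feature) with two natural groups.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
centers, clusters = kmeans_1d(values, centers=[0.8, 5.3])
print(clusters)  # [[1.0, 1.2, 0.8], [5.0, 5.3, 4.9]]
```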


Virtual microscopy

Viewing of digital tissues or specimens using a computer and monitor in place of a traditional microscope (also referred to as “telepathology” in a diagnostic setting).


Whole slide image

Digital representation of an entire histopathologic glass slide, digitized at microscope resolution. These whole slide scans are typically produced using slide scanners and viewed through slide scan viewing software in a way that mimics traditional microscopy.




References

· Abels et al. Computational pathology definitions, best practices, and recommendations for regulatory guidance: a white paper from the Digital Pathology Association. J Pathol 2019; 249:286-294.

· Aeffner et al. Introduction to digital image analysis in whole-slide imaging: a white paper from the Digital Pathology Association. J Path Inform 2019; 10:9.

· Kountchev, R. and Iantovics, B.L., 2013. Advances in intelligent analysis of medical data and decision support systems. Springer, Heidelberg; New York.

· Lee R.E., Le L.P., Gilbertson J. (2013). Pathology Informatics. In: Cheng L., Zhang D., Eble J. (eds) Molecular Genetic Pathology. Springer, New York, NY.

· Shortliffe, E.H. and Cimino, J.J., 2006. Biomedical informatics: computer applications in health care and biomedicine, 3rd ed. Springer, New York, NY.

· Shortliffe, E.H., 2001. Medical informatics: computer applications in health care and biomedicine, 2nd ed. Springer, New York.

· Sinard JH, ed. Practical Pathology Informatics. New York, NY: Springer Science + Business Media, Inc.; 2006.









©Copyright 2020 American College of Veterinary Pathology