The HIV Brain Sequence Database contains three categories of annotations: publication references, patient and sampling information, and sequence properties (see table below).

The publication annotations include bibliographic information identifying the study that generated the sequences. Patient and sampling annotations describe the individual patients and their clinical information at the time of sampling.

This information was obtained by manual curation of the original publications and, in some cases, direct communication with the study authors. Where multiple studies examined tissue samples from the same patient, the resulting sequences are linked to the same patient code to increase statistical power.
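The linking of sequences from several studies to one patient code can be sketched as a simple grouping step. This is a minimal illustration only: the row layout, the `group_by_patient` helper, and every identifier in `ROWS` are hypothetical placeholders, not real database entries or the curators' actual procedure.

```python
from collections import defaultdict

# Hypothetical annotation rows: (patient_code, study_id, accession).
# All identifiers below are placeholders, not real database entries.
ROWS = [
    ("PT-A", "study-1", "SEQ001"),
    ("PT-A", "study-1", "SEQ002"),
    ("PT-A", "study-2", "SEQ003"),
    ("PT-B", "study-2", "SEQ004"),
]


def group_by_patient(rows):
    """Collect accessions and contributing studies under each patient code."""
    grouped = defaultdict(lambda: {"studies": set(), "accessions": []})
    for patient_code, study_id, accession in rows:
        grouped[patient_code]["studies"].add(study_id)
        grouped[patient_code]["accessions"].append(accession)
    return dict(grouped)


patients = group_by_patient(ROWS)
for code, info in sorted(patients.items()):
    print(code, len(info["accessions"]), "sequences from",
          len(info["studies"]), "studies")
```

Keying on the patient code rather than on the publication means that sequences from two studies of the same individual end up in one group, which is what allows them to be analyzed together.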

Sample timepoint annotations describe the patient’s clinical health status, neuropathological and neurocognitive status, CD4 count, viral load, and antiretroviral treatment history at the time of sampling. Clone and sequence annotations describe the individual sequences and the tissue from which they were cloned.

These include the sequence start and end locations, numbered based on alignment to the HXB2 reference genome, and the tissue source, coded using terms from a formal anatomical ontology. Alignment to HXB2 was performed using the HIV Sequence Locator tool at the LANL HIV Sequence Database.
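As a rough sketch of how HXB2-referenced start and end coordinates might be used, the snippet below computes the span a sequence covers and checks whether it contains a region of interest. It assumes the coordinates are 1-based and inclusive, which the text does not state; the function names and all numbers are placeholders chosen for illustration.

```python
def aligned_span(start, end):
    """Number of HXB2 positions covered, assuming 1-based inclusive coordinates."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def covers_region(start, end, region_start, region_end):
    """True if the sequence spans the whole region (all in HXB2 numbering)."""
    return start <= region_start and region_end <= end


# Placeholder coordinates, not taken from any real record or gene map.
seq_start, seq_end = 1000, 1499
region = (1100, 1200)

print(aligned_span(seq_start, seq_end))
print(covers_region(seq_start, seq_end, *region))
```

Because every sequence is numbered against the same reference, a check like `covers_region` can compare sequences from different studies without realigning them. The aligned span may differ from the stored sequence length when a sequence carries insertions or deletions relative to HXB2.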

Patient code: patient code
Sex: gender
Risk factor: HIV risk factor
Tissue bank: tissue bank distributing samples
Patient year of death: patient year of death
Sampling timepoint
Sampling geo-region: patient geo-region at time of sampling
Sampling country: patient country at time of sampling
Sampling city: patient city at time of sampling
Patient age: patient age at sampling
Health status: patient health status at sampling
Subtype: predominant subtype at time of sampling
Drug naive: whether the patient has had ART
Antiretroviral treatment: patient ART history
Viral load plasma (copies/mL): plasma viral load
Viral load brain (copies/million cells): brain viral load
Viral load lymphoid (copies/million cells): lymphoid viral load
CD4 count (cells/µL): CD4 count
Neurocognitive diagnosis: neurocognitive diagnosis
Neuropathological diagnosis: neuropathological diagnosis
Giant cells: whether giant cells were present in the brain
GenBank accession: GenBank accession number
GI: GenBank GI number
PubMed ID: PubMed ID for original publication
Sequence length: sequence length
Clone name: publication-assigned clone name
Sample tissue class: global tissue class (Brain, Blood & Lymphoid, etc.)
Sample tissue name: tissue source
Sample tissue FMA code: tissue FMA code
Nucleic acid type: whether proviral DNA or viral RNA was sequenced
Start and end coordinates: sequence start and end referenced to HXB2
Sequence: viral sequence

The tissue source annotation is based on the Foundational Model of Anatomy (FMA), developed at the University of Washington by the FMA™ Research Project, and is provided under license from the University of Washington.
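To make the annotation fields listed above more concrete, here is a minimal sketch of how one annotated sequence could be held in memory and filtered. The `SequenceRecord` class, its attribute names, the value encodings, and the sample records are all assumptions made for illustration; they do not reflect the database's actual schema or export format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SequenceRecord:
    """Hypothetical in-memory form of one annotated sequence."""
    patient_code: str
    tissue_class: str                # e.g. "Brain" (assumed encoding)
    nucleic_acid_type: str           # "DNA" or "RNA" (assumed encoding)
    hxb2_start: int
    hxb2_end: int
    sequence: str
    cd4_count: Optional[int] = None  # cells/uL; None when not reported


def select(records: List[SequenceRecord], tissue_class: str,
           require_cd4: bool = False) -> List[SequenceRecord]:
    """Keep records from one tissue class, optionally only those with a CD4 count."""
    return [
        r for r in records
        if r.tissue_class == tissue_class
        and (not require_cd4 or r.cd4_count is not None)
    ]


# Placeholder records; none of these values come from the database.
records = [
    SequenceRecord("PT-A", "Brain", "DNA", 1000, 1499, "ACGT", cd4_count=50),
    SequenceRecord("PT-A", "Blood & Lymphoid", "RNA", 1000, 1499, "ACGA"),
    SequenceRecord("PT-B", "Brain", "DNA", 1000, 1499, "ACGG"),
]

brain = select(records, "Brain")
brain_with_cd4 = select(records, "Brain", require_cd4=True)
print(len(brain), len(brain_with_cd4))
```

Modeling missing clinical values as `None` instead of a sentinel number keeps unreported measurements from being mistaken for real ones when filtering.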