CodelistGenerator

Installation

You can install CodelistGenerator from CRAN:

install.packages("CodelistGenerator")

Or you can install the development version of CodelistGenerator from GitHub:

install.packages("remotes")
remotes::install_github("darwin-eu/CodelistGenerator")

Example usage

library(dplyr)
library(CDMConnector)
library(CodelistGenerator)

For this example we’ll use the Eunomia dataset, which contains only a subset of the OMOP CDM vocabularies:

db <- DBI::dbConnect(duckdb::duckdb(), dbdir = eunomia_dir())
cdm <- cdm_from_con(db, cdm_schema = "main", write_schema = c(prefix = "cg_", schema = "main"))

Exploring the OMOP CDM Vocabulary tables

OMOP CDM vocabularies are frequently updated, so it is useful to check which vocabulary version our Eunomia data uses:

getVocabVersion(cdm = cdm)
#> [1] "v5.0 18-JAN-19"

CodelistGenerator provides various other functions to explore the vocabulary tables. For example, we can see the different concept classes of standard concepts used for drugs:

getConceptClassId(cdm,
                  standardConcept = "Standard",
                  domain = "Drug")
#> [1] "Ingredient"          "Quant Clinical Drug" "Branded Drug"       
#> [4] "Quant Branded Drug"  "Clinical Drug Comp"  "Branded Drug Comp"  
#> [7] "CVX"                 "Clinical Drug"       "Branded Pack"

Vocabulary based codelists using CodelistGenerator

CodelistGenerator provides functions to extract codelists based on vocabulary hierarchies. One example is `getDrugIngredientCodes`, which we can use to get all the concept IDs used to represent aspirin.

getDrugIngredientCodes(cdm = cdm, name = "aspirin")
#> 
#> - aspirin (2 codes)

If we also want the details of these concept IDs, we can set withConceptDetails = TRUE.

getDrugIngredientCodes(cdm = cdm, name = "aspirin", withConceptDetails = TRUE)
#> $aspirin
#> # A tibble: 2 × 4
#>   concept_id concept_name              domain_id vocabulary_id
#>        <int> <chr>                     <chr>     <chr>        
#> 1   19059056 Aspirin 81 MG Oral Tablet Drug      RxNorm       
#> 2    1112807 Aspirin                   Drug      RxNorm

And if we want codelists for all drug ingredients we can simply omit the name argument and all ingredients will be returned.

ing <- getDrugIngredientCodes(cdm = cdm)
ing$aspirin
#> [1] 19059056  1112807
ing$diclofenac
#> [1] 1124300
ing$celecoxib
#> [1] 1118084
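
Because the result is a named list with one vector of concept IDs per ingredient, ordinary list tools apply to it. As a minimal sketch, assuming the list has the shape shown in the output above (the values below are retyped from that output rather than queried from a database, and `selected_ids` is only an illustrative name), we can count the codes per ingredient or combine several ingredients into one vector:

```r
# Stand-in for the result of getDrugIngredientCodes(), rebuilt by hand
# from the output shown above so this runs without a database connection.
ing <- list(
  aspirin = c(19059056L, 1112807L),
  diclofenac = 1124300L,
  celecoxib = 1118084L
)

# Number of concept IDs in each ingredient codelist
n_codes <- lengths(ing)

# One combined vector of concept IDs across the selected ingredients
selected_ids <- unique(
  unlist(ing[c("diclofenac", "celecoxib")], use.names = FALSE)
)
```

With the full list returned by the package, the same calls would presumably work unchanged, since only base R list handling is involved.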

Systematic search using CodelistGenerator

CodelistGenerator can also support systematic searches of the vocabulary tables to support codelist development. A little like the process for a systematic review, the idea is that, for a specified search strategy, CodelistGenerator identifies a set of concepts that may be relevant, and clinical experts then screen these to remove any irrelevant codes.

We can do a simple search for asthma

asthma_codes1 <- getCandidateCodes(
  cdm = cdm,
  keywords = "asthma",
  domains = "Condition"
) 
asthma_codes1 %>% 
  glimpse()
#> Rows: 2
#> Columns: 6
#> $ concept_id       <int> 4051466, 317009
#> $ found_from       <chr> "From initial search", "From initial search"
#> $ concept_name     <chr> "Childhood asthma", "Asthma"
#> $ domain_id        <chr> "Condition", "Condition"
#> $ vocabulary_id    <chr> "SNOMED", "SNOMED"
#> $ standard_concept <chr> "S", "S"

But perhaps we want to exclude certain concepts as part of the search strategy. In this case we can add them using the exclude argument:

asthma_codes2 <- getCandidateCodes(
  cdm = cdm,
  keywords = "asthma",
  exclude = "childhood",
  domains = "Condition"
) 
asthma_codes2 %>% 
  glimpse()
#> Rows: 1
#> Columns: 6
#> $ concept_id       <int> 317009
#> $ found_from       <chr> "From initial search"
#> $ concept_name     <chr> "Asthma"
#> $ domain_id        <chr> "Condition"
#> $ vocabulary_id    <chr> "SNOMED"
#> $ standard_concept <chr> "S"
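
To make the effect of the keyword and exclusion terms concrete, here is a rough base R sketch of the idea. It is not the package's implementation: it assumes a simple case-insensitive substring match on concept names, and the `concepts` table is a hypothetical miniature holding only the two rows returned above.

```r
# Hypothetical miniature concept table; both rows are taken from the
# search output shown above.
concepts <- data.frame(
  concept_id = c(4051466L, 317009L),
  concept_name = c("Childhood asthma", "Asthma")
)

keyword <- "asthma"
exclude <- "childhood"

# Case-insensitive matching on the concept name (an assumption made
# for this illustration only)
matches_keyword <- grepl(keyword, concepts$concept_name, ignore.case = TRUE)
matches_exclude <- grepl(exclude, concepts$concept_name, ignore.case = TRUE)

# Keep concepts that match the keyword but not the exclusion term
kept <- concepts[matches_keyword & !matches_exclude, ]
```

Under these assumptions only the "Asthma" row remains, which mirrors the single concept returned by the search with the exclusion applied.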

We can compare these two codelists using compareCodelists:

compareCodelists(asthma_codes1, asthma_codes2)
#> # A tibble: 2 × 3
#>   concept_id concept_name     codelist       
#>        <int> <chr>            <chr>          
#> 1    4051466 Childhood asthma Only codelist 1
#> 2     317009 Asthma           Both
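
The comparison amounts to checking, for every concept ID found in either codelist, which of the two lists contains it. A minimal base R sketch of that logic follows; it is an illustration under that assumption rather than the package's own code, and the two data frames are retyped from the outputs above. The label "Only codelist 2" is an assumed counterpart to the "Only codelist 1" label shown in the output.

```r
# The two codelists from the searches above, retyped by hand
codes1 <- data.frame(
  concept_id = c(4051466L, 317009L),
  concept_name = c("Childhood asthma", "Asthma")
)
codes2 <- data.frame(
  concept_id = 317009L,
  concept_name = "Asthma"
)

# Every concept ID that appears in at least one codelist
all_ids <- union(codes1$concept_id, codes2$concept_id)

# Membership flags for each codelist
in1 <- all_ids %in% codes1$concept_id
in2 <- all_ids %in% codes2$concept_id

# Label each concept by where it was found
comparison <- data.frame(
  concept_id = all_ids,
  codelist = ifelse(in1 & in2, "Both",
                    ifelse(in1, "Only codelist 1", "Only codelist 2"))
)
```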

We can also see the non-standard codes that these concepts are mapped from. For example, here we search for gastrointestinal hemorrhage, where a non-standard ICD10 code maps to the standard SNOMED code returned by our search:

Gastrointestinal_hemorrhage <- getCandidateCodes(
  cdm = cdm,
  keywords = "Gastrointestinal hemorrhage",
  domains = "Condition"
)
Gastrointestinal_hemorrhage %>% 
  glimpse()
#> Rows: 1
#> Columns: 6
#> $ concept_id       <int> 192671
#> $ found_from       <chr> "From initial search"
#> $ concept_name     <chr> "Gastrointestinal hemorrhage"
#> $ domain_id        <chr> "Condition"
#> $ vocabulary_id    <chr> "SNOMED"
#> $ standard_concept <chr> "S"

Summarising code use

summariseCodeUse(list("asthma" = asthma_codes1$concept_id),  
                 cdm = cdm) %>% 
  glimpse()
#> Rows: 6
#> Columns: 13
#> $ result_id        <int> 1, 1, 1, 1, 1, 1
#> $ cdm_name         <chr> "Synthea synthetic health database", "Synthea synthet…
#> $ group_name       <chr> "codelist_name", "codelist_name", "codelist_name", "c…
#> $ group_level      <chr> "asthma", "asthma", "asthma", "asthma", "asthma", "as…
#> $ strata_name      <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ strata_level     <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ variable_name    <chr> "overall", "Childhood asthma", "Asthma", "overall", "…
#> $ variable_level   <chr> NA, "4051466", "317009", NA, "4051466", "317009"
#> $ estimate_name    <chr> "record_count", "record_count", "record_count", "pers…
#> $ estimate_type    <chr> "integer", "integer", "integer", "integer", "integer"…
#> $ estimate_value   <chr> "101", "96", "5", "101", "96", "5"
#> $ additional_name  <chr> "overall", "source_concept_name &&& source_concept_id…
#> $ additional_level <chr> "overall", "Childhood asthma &&& 4051466 &&& conditio…
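
Note that estimate_value is stored as character even though estimate_type is "integer", so the values need converting before any arithmetic. The sketch below shows one way to do that in base R. It uses only the three record_count rows visible in the output above, retyped by hand with just the columns needed, and the `share` column is a hypothetical addition for illustration.

```r
# First three rows of the result shown above (the record_count
# estimates), retyped by hand with only the columns needed here.
code_use <- data.frame(
  variable_name = c("overall", "Childhood asthma", "Asthma"),
  estimate_name = "record_count",
  estimate_value = c("101", "96", "5")
)

# Convert the character estimates to integers before doing arithmetic
code_use$estimate_value <- as.integer(code_use$estimate_value)

# Per-concept rows only, with each concept's share of all records
by_concept <- code_use[code_use$variable_name != "overall", ]
by_concept$share <- by_concept$estimate_value / sum(by_concept$estimate_value)
```

In this example the per-concept record counts add up to the overall count.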
