dhlab documentation
dhlab is a Python library for qualitative and quantitative analysis of the digital texts from Nettbiblioteket (English: “the online library”) at the National Library of Norway (NLN). Nettbiblioteket is the NLN’s digital collection of media publications.
Head to our official homepage for tutorials, how-to guides, interactive web apps, and more information about the DH-lab group.
Installation
Ensure that you already have Python >=3.9 installed.
Install the latest version of the dhlab Python package in your (Unix) terminal with pip:
pip install -U dhlab
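To confirm that the installation succeeded, a small check along the following lines may help. It is a minimal sketch that relies only on the Python standard library; the helper name `installed_version` is hypothetical and not part of dhlab.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is not installed.

    Hypothetical helper for illustration; it is not part of dhlab.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# The installation notes above ask for Python >= 3.9.
python_ok = sys.version_info >= (3, 9)

print("Python >= 3.9:", python_ok)
print("dhlab version:", installed_version("dhlab"))
```

If the second line prints `None`, the package is not visible to the interpreter you are running, which often means pip installed it into a different environment.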
Functionality
Here are some of the text mining and automatic analyses you can do with dhlab:
Build a corpus from bibliographic metadata about publications.
Retrieve word (token) frequencies from a corpus.
Fetch chunks of text (paragraphs) as bags of words from a specific publication.
Extract concordances.
Calculate collocations.
Retrieve n-gram frequencies per year in a time period.
Extract occurrences of named entities.
Plot narrative graphs of word dispersions in a publication.
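For readers new to these terms, the sketch below illustrates three of the analyses listed above (token frequencies, concordances, and collocations) on a short in-memory string. It is plain Python written for illustration only: it does not call dhlab or query Nettbiblioteket, and the function names, the simple regex tokenizer, and the `window` parameter are all assumptions rather than dhlab's API.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"\w+", text.lower())


def token_frequencies(text):
    """Count how often each token occurs in the text."""
    return Counter(tokenize(text))


def concordance(text, keyword, window=2):
    """Return (left context, keyword, right context) for every occurrence of `keyword`."""
    tokens = tokenize(text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits


def collocations(text, keyword, window=2):
    """Count the tokens that appear within `window` positions of `keyword`."""
    tokens = tokenize(text)
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            counts.update(tokens[max(0, i - window):i])
            counts.update(tokens[i + 1:i + 1 + window])
    return counts


# Made-up sample text, not taken from any NLN publication.
sample = "The library lends books. The library also digitizes old books."

print(token_frequencies(sample).most_common(3))
print(concordance(sample, "library", window=1))
print(collocations(sample, "library", window=1))
```

dhlab performs these kinds of analyses against the NLN collection itself; the sketch only shows what each term means at small scale.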
Try some of our examples to get started.