dhlab.text.corpus_collection

Module Contents

Classes

CorpusCollection

A class for handling a collection of corpora.

API

class dhlab.text.corpus_collection.CorpusCollection(corpora: Optional[Dict[str, dhlab.text.corpus.Corpus]] = None)

A class for handling a collection of corpora.

Initialization

Initialize the collection with an optional dictionary that maps names to corpora.
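The constructor signature above can be illustrated with a minimal sketch. `Corpus` and `CollectionSketch` here are hypothetical stand-ins, not the dhlab classes, and the handling of `None` (starting with an empty collection) is an assumption, since the reference does not state it.

```python
from typing import Dict, List, Optional


class Corpus:
    """Hypothetical stand-in for dhlab.text.corpus.Corpus."""

    def __init__(self, doc_ids: List[int]) -> None:
        self.doc_ids = list(doc_ids)


class CollectionSketch:
    """Hypothetical class that mirrors only the documented constructor signature."""

    def __init__(self, corpora: Optional[Dict[str, Corpus]] = None) -> None:
        # Assumption: None means "start empty". Copying the dict keeps later
        # changes to the caller's dict from leaking into the collection.
        self.corpora: Dict[str, Corpus] = dict(corpora) if corpora is not None else {}


# Construct without arguments, or seed the collection with named corpora.
empty = CollectionSketch()
seeded = CollectionSketch({"novels": Corpus([1, 2]), "news": Corpus([3])})
```

Using `None` as the default rather than `{}` is the usual Python idiom for avoiding a mutable default argument shared between instances.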

__getitem__(key: str) → dhlab.text.corpus.Corpus

Get a corpus by name.

__setitem__(key: str, value: dhlab.text.corpus.Corpus)

Set a corpus by name.

__repr__() → str

Return a string listing the names of the corpora.

__iter__()

Iterate over the names of the corpora.

__len__() → int

Return the number of corpora.

__contains__(key: str) → bool

Check whether a corpus with the given name is in the collection.
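The special methods listed above are what let a collection be used with ordinary Python syntax (`coll[name]`, `name in coll`, `len(coll)`, `for name in coll`). The sketch below shows that mapping on a hypothetical stand-in class backed by a plain dict. It is not the dhlab implementation; the `CorpusLike` type and the exact `repr` format are assumptions made for illustration.

```python
from typing import Dict, Iterator, List, Optional

# Hypothetical stand-in for a Corpus: a list of document IDs.
CorpusLike = List[int]


class CollectionSketch:
    """Hypothetical class that mirrors the documented special methods."""

    def __init__(self, corpora: Optional[Dict[str, CorpusLike]] = None) -> None:
        self._corpora: Dict[str, CorpusLike] = dict(corpora or {})

    def __getitem__(self, key: str) -> CorpusLike:
        return self._corpora[key]

    def __setitem__(self, key: str, value: CorpusLike) -> None:
        self._corpora[key] = value

    def __repr__(self) -> str:
        # Assumed format; the reference only says the names are listed.
        return f"CollectionSketch({list(self._corpora)})"

    def __iter__(self) -> Iterator[str]:
        return iter(self._corpora)

    def __len__(self) -> int:
        return len(self._corpora)

    def __contains__(self, key: str) -> bool:
        return key in self._corpora


coll = CollectionSketch()
coll["novels"] = [100, 101]    # __setitem__
coll["news"] = [200]
first = coll["novels"]         # __getitem__
names = list(coll)             # __iter__ yields names, not corpora
count = len(coll)              # __len__
has_news = "news" in coll      # __contains__
has_poetry = "poetry" in coll
text = repr(coll)              # __repr__
```

Note that iteration yields the corpus names, as the reference states, so `coll[name]` is needed inside a loop to reach each corpus.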

add(name: str, corpus: dhlab.text.corpus.Corpus)

Add a corpus to the collection.

remove(name: str)

Remove a corpus from the collection.

get(name: str) → dhlab.text.corpus.Corpus

Get a corpus by name.

show_corpora() → Dict[str, dhlab.text.corpus.Corpus]

Return the corpora in the collection as a dictionary keyed by name.

concat_corpora() → dhlab.text.corpus.Corpus

Concatenate all corpora in the collection into a single corpus.
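A typical workflow with the named methods (`add`, `remove`, `get`, `show_corpora`, `concat_corpora`) might look like the sketch below. It runs against a hypothetical stand-in class, not dhlab itself. Several behaviors are assumptions the reference does not confirm: that `remove` and `get` raise `KeyError` for unknown names, and that concatenation stacks records in insertion order without dropping duplicates.

```python
from typing import Dict, List

# Hypothetical stand-in for dhlab.text.corpus.Corpus: a list of document records.
CorpusLike = List[Dict[str, object]]


class CollectionSketch:
    """Hypothetical class that mirrors only the documented method names."""

    def __init__(self) -> None:
        self._corpora: Dict[str, CorpusLike] = {}

    def add(self, name: str, corpus: CorpusLike) -> None:
        self._corpora[name] = corpus

    def remove(self, name: str) -> None:
        # Assumption: removing an unknown name raises KeyError.
        del self._corpora[name]

    def get(self, name: str) -> CorpusLike:
        # Assumption: an unknown name raises KeyError.
        return self._corpora[name]

    def show_corpora(self) -> Dict[str, CorpusLike]:
        return self._corpora

    def concat_corpora(self) -> CorpusLike:
        # Assumption: stack the records of every corpus in insertion order,
        # without removing duplicates.
        merged: CorpusLike = []
        for corpus in self._corpora.values():
            merged.extend(corpus)
        return merged


coll = CollectionSketch()
coll.add("novels", [{"id": 1, "title": "A"}, {"id": 2, "title": "B"}])
coll.add("news", [{"id": 3, "title": "C"}])
coll.add("drafts", [{"id": 4, "title": "D"}])
coll.remove("drafts")

novels = coll.get("novels")
names = list(coll.show_corpora())
merged_ids = [record["id"] for record in coll.concat_corpora()]
```

With the real class, the corpora would be `dhlab.text.corpus.Corpus` objects and `concat_corpora()` would return a single `Corpus`; how overlapping documents are handled should be checked against the library itself.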