dhlab class demo#
import dhlab as dh
Corpus#
dh.Corpus??
Init signature:
dh.Corpus(
    doctype=None,
    author=None,
    freetext=None,
    fulltext=None,
    from_year=None,
    to_year=None,
    from_timestamp=None,
    to_timestamp=None,
    title=None,
    ddk=None,
    subject=None,
    lang=None,
    limit=10,
    order_by='random',
)
korpus = dh.Corpus(doctype="digibok", title="Dracula")
korpus.frame.iloc[:5, [0,1,2,3,9]]
 | dhlabid | urn | title | authors | year |
---|---|---|---|---|---|
0 | 100163812 | URN:NBN:no-nb_digibok_2013062438048 | Dracula : et dansedrama i tre akter basert på ... | Levin , Mona / Stoker , Bram | 2002 |
1 | 100346414 | URN:NBN:no-nb_digibok_2017091805047 | Dracula | MacDonald , Eric / Stoker , Bram | 1983 |
2 | 100462841 | URN:NBN:no-nb_digibok_2008090904123 | Dracula | Fletcher-Watson , Jo / Kolstad , Henning / Bac... | 1998 |
3 | 100547952 | URN:NBN:no-nb_digibok_2011071108102 | Dracula | Stoker , Bram / Carling , Bjørn | 2006 |
4 | 100276465 | URN:NBN:no-nb_digibok_2016011248064 | Dracula | Mucci , Michael / Valgermo , Finn / Stoker , B... | 2009 |
Concordance#
dh.Concordance??
dh.Concordance(corpus=None, query=None, window=20, limit=500)
korpus.conc("Dracula").show()
 | link | concordance |
---|---|---|
97 | URN:NBN:no-nb_digibok_2008090904123 | DRACULA |
410 | URN:NBN:no-nb_digibok_2011071108102 | DRACULA |
107 | URN:NBN:no-nb_digibok_2008090904123 | Jakten Morris og Seward fulgte etter Dracula ved å ri langs bredden av Bistritza . De red opp til Borgopasset... |
49 | URN:NBN:no-nb_digibok_2013013108024 | Bram Stoker Dracula |
120 | URN:NBN:no-nb_digibok_2014071506008 | ... hans egen jord ! Det var så sannelig en Dracula . Det var han hvis egen uverdige bror , etter... |
414 | URN:NBN:no-nb_digibok_2011071108102 | DRACULA |
109 | URN:NBN:no-nb_digibok_2008090904123 | ... ^ j | ^ 9 9 j stearinlys representerer Dracula var The Vampyre ( 1819 ) , en / /... |
24 | URN:NBN:no-nb_digibok_2010020103031 | K Dracula ... ; |
94 | URN:NBN:no-nb_digibok_2008090904123 | Vlad Dracula |
119 | URN:NBN:no-nb_digibok_2014071506008 | I mellomtiden må jeg prøve å finne ut alt jeg kan om grev Dracula , for det kan være nyttig... |
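To make the `window` and `limit` parameters more concrete, here is a minimal keyword-in-context sketch in plain Python. It is not how dhlab produces concordances (those are fetched from the library's backend). The function name `kwic`, the whitespace tokenization, and the reading of `window` as a number of tokens on each side of the hit are all assumptions made for illustration.

```python
def kwic(text, query, window=3, limit=500):
    """Return up to `limit` keyword-in-context snippets for `query`.

    Hypothetical helper: `window` is assumed to count tokens on each
    side of the hit, which may differ from dhlab's own definition.
    """
    tokens = text.split()  # naive whitespace tokenization
    hits = []
    for i, tok in enumerate(tokens):
        if tok == query:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append(" ".join(left + [tok] + right))
            if len(hits) >= limit:
                break
    return hits


# Sample text taken from one of the concordance rows above
sample = "finne ut alt jeg kan om grev Dracula , for det kan"
snippets = kwic(sample, "Dracula", window=3)
print(snippets)
```

A smaller `window` gives shorter snippets, and `limit` simply caps how many hits are collected.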
Frequency#
dh.Counts??
dh.Counts(corpus=None, words=None)
korpus.count().display_names()
 | Dracula | Dracula : fritt etter Bram Stokers roman | Bram Stoker's Dracula | Dracula | Dracula | Dracula : et dansedrama i tre akter basert på Bram Stokers roman | Dracula | Dracula | Dracula |
---|---|---|---|---|---|---|---|---|---|
. | 646.0 | 3268.0 | 578.0 | 8495.0 | 8447.0 | 99.0 | 8459.0 | 127.0 | 832.0 |
, | 500.0 | 1384.0 | 368.0 | 9133.0 | 9659.0 | 144.0 | 9636.0 | 80.0 | 678.0 |
og | 288.0 | 524.0 | 147.0 | 6206.0 | 6350.0 | 110.0 | 6326.0 | 0.0 | 261.0 |
i | 265.0 | 449.0 | 250.0 | 2137.0 | 3092.0 | 77.0 | 3066.0 | 45.0 | 187.0 |
^ | 154.0 | 0.0 | 189.0 | 0.0 | 2.0 | 0.0 | 0.0 | 48.0 | 1.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
forts. | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 10.0 |
Nu | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 13.0 |
Pause | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 13.0 |
Ton | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 17.0 |
onathan | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 25.0 |
22582 rows × 9 columns
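The table above has one row per word and one column per document, with 0.0 where a word does not occur in a document. A rough sketch of how such a word-by-document matrix can be built, using only the standard library, might look like this. The two toy documents and the names `docs`, `counts`, and `matrix` are made up for illustration; dhlab fetches real counts from its backend rather than tokenizing text locally.

```python
from collections import Counter

# Hypothetical toy documents standing in for the books in the corpus
docs = {
    "doc_a": "Dracula og grev Dracula .",
    "doc_b": "og Mina .",
}

# One Counter per document; a missing word counts as 0
counts = {name: Counter(text.split()) for name, text in docs.items()}

# The shared vocabulary is the union of all words seen in any document
vocabulary = sorted(set().union(*counts.values()))

# Rows are words, columns follow the order of `docs`
matrix = {word: [counts[name][word] for name in docs] for word in vocabulary}

for word, row in matrix.items():
    print(word, row)
```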
#
from dhlab import totals
tot = totals()
tot.freq
. 7655423257
, 5052171514
i 2531262027
og 2520268056
- 1314451583
...
tidspunkter 110667
dirigenter 110660
ondartet 110652
kulturtilbud 110652
trassig 110651
Name: freq, Length: 50000, dtype: int64
(korpus.coll("Dracula").frame.counts / tot.freq).sort_values(ascending=False).to_frame().head(10)
 | 0 |
---|---|
Dracula | 0.000289 |
grev | 0.000093 |
Grev | 0.000059 |
Jonathan | 0.000046 |
tyrkerne | 0.000030 |
uverdige | 0.000026 |
vedlagte | 0.000024 |
hungersnød | 0.000024 |
Helsing | 0.000023 |
Mina | 0.000021 |
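The ranking above divides each collocate's count by its overall frequency in the reference list, so words that are common everywhere (such as `og`) sink and words typical of the Dracula contexts rise. The same arithmetic can be sketched with plain dictionaries. The local counts and most reference totals below are invented for illustration; only the total for `og` is copied from the `tot.freq` output. In pandas, words missing from the reference list become NaN after the division; here they are simply skipped.

```python
# Hypothetical collocation counts around the keyword
local_counts = {"Dracula": 40, "grev": 12, "og": 50, "Borgopasset": 3}

# Reference totals: "og" is taken from the output above, the rest are invented
reference_totals = {"Dracula": 138_000, "grev": 120_000, "og": 2_520_268_056}

# Ratio of local count to reference frequency, for words present in both
ratios = {
    word: count / reference_totals[word]
    for word, count in local_counts.items()
    if word in reference_totals
}

# Highest ratio first, mirroring sort_values(ascending=False)
ranked = sorted(ratios.items(), key=lambda item: item[1], reverse=True)

for word, ratio in ranked:
    print(f"{word}\t{ratio:.9f}")
```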
Ngram#
??dh.Ngram
dh.Ngram(
    words=None,
    from_year=None,
    to_year=None,
    doctype='bok',
    mode='relative',
    lang='nob',
    **kwargs,
)
dh.Ngram(["Dracula", "Frankenstein"], from_year=1880, to_year=2020)
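The default `mode='relative'` presumably reports each word's yearly count as a share of all tokens published that year, rather than the raw count. That reading is an assumption, not something confirmed by the signature above. Under that assumption, the normalization would look roughly like this, with invented numbers:

```python
# Hypothetical yearly hits for a single word
yearly_counts = {1897: 2, 1992: 30}

# Hypothetical total number of tokens published in each year
yearly_totals = {1897: 1_000_000, 1992: 5_000_000}

# Relative frequency: hits divided by the size of that year's material
relative = {
    year: yearly_counts[year] / yearly_totals[year]
    for year in yearly_counts
}

for year, share in sorted(relative.items()):
    print(year, share)
```

Normalizing this way keeps years with more published text from dominating the curve simply because they contain more words.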
