ICE Bahamas
Bahamian Press Reports (2002-2007)
Corpus Overview
The ICE Bahamas corpus contains 73 press reports from Bahamian newspapers published between 2002 and 2007, giving access to written Standard Bahamian English from that period.
This corpus is part of the International Corpus of English (ICE) project, representing the Press Reports (W2C) category from the Bahamas component. It has been processed using spaCy’s transformer model (en_core_web_trf) for high-quality linguistic annotation.
Key Statistics
| Metric | Value |
|---|---|
| Total Words | ~43,000 |
| Documents | 73 press reports |
| Time Period | 2002-2007 |
| Language | Standard Bahamian English |
| Source | ICE-Bahamas project (LMU Munich) |
Geographic Distribution
| Place | Documents | Percentage |
|---|---|---|
| Nassau | 43 | 59% |
| Freeport | 30 | 41% |
Temporal Distribution
| Year | Documents |
|---|---|
| 2002 | 7 |
| 2003 | 14 |
| 2004 | 10 |
| 2005 | 10 |
| 2006 | 4 |
| 2007 | 27 |
Available Attributes
Token Attributes
| Attribute | Description | Example |
|---|---|---|
| word | Exact word form | [word="Bahamas"] |
| lemma | Base form (lemma) | [lemma="visit"] |
| tag | POS tag (Penn Treebank) | [tag="NN.*"] |
| head_n | Head position (1-based, 0 for root) | Dependency parsing |
| head | Head lemma | Dependency parsing |
| deprel | Dependency relation | [deprel="nsubj"] |
| ent_type | Named entity type (ORG, PERSON, GPE, etc.) | [ent_type="GPE"] |
| ent_iob | NER IOB tag (B=begin, I=inside, O=outside) | [ent_iob="B"] |
| morph | Morphological features (UD FEATS format) | [morph=".*Tense=Past.*"] |
Document Attributes
| Attribute | Description | Example Values |
|---|---|---|
| doc.id | Document ID | 1-73 |
| doc.filename | Original filename | Pr_01.xml, Pr_02.xml, etc. |
| doc.date | Publication date | YYYYMMDD format |
| doc.year | Publication year | 2002-2007 |
| doc.place | Publication location | Nassau, Freeport |
| doc.gender | Author gender | f, m, ?, TODO |
| doc.age | Author age | Mostly unknown |
| doc.ethnicity | Author ethnic group | Mostly unknown |
CQL Query Examples
Basic Word Search
Find all occurrences of “Bahamas”: [word="Bahamas"]
Find all forms of “visit”: [lemma="visit"]
Named Entity Recognition
Find all location names (countries, cities): [ent_type="GPE"]
Find all organization names: [ent_type="ORG"]
Find all person names: [ent_type="PERSON"]
Part-of-Speech Queries
Find all proper nouns: [tag="NNP.*"]
Find all past tense verbs: [morph=".*Tense=Past.*"]
Dependency Relations
Find subjects of verbs: [deprel="nsubj"]
Find direct objects:
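A possible query for direct objects is sketched below. It assumes the corpus keeps the dependency labels that spaCy's English models emit, where direct objects are labelled dobj. If the labels were converted to Universal Dependencies, the label would be obj instead, so the pattern accepts either value.

```cql
[deprel="dobj|obj"]
```

To restrict the results to objects of one verb, the head attribute (head lemma) can be added as a second condition, for example [deprel="dobj|obj" & head="visit"].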
Combined Queries
Find “hotel” followed by a location within 3 words:
Find adjectives modifying “tourism”:
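The two combined queries above could be written roughly as follows. Both are sketches built only from the attributes listed under Available Attributes. The first reads "within 3 words" as "at most three intervening tokens", which is an assumption about the intended window.

```cql
[lemma="hotel"] []{0,3} [ent_type="GPE"]
```

The second uses the dependency annotation rather than word order: it matches any adjective whose syntactic head has the lemma "tourism". This assumes the head attribute stores the governing lemma, as described in the attribute table.

```cql
[tag="JJ.*" & head="tourism"]
```

If only directly preceding adjectives are wanted, a plain sequence such as [tag="JJ.*"] [lemma="tourism"] is a simpler alternative, at the cost of missing adjectives separated from the noun by other words.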
Filter by Location
Search within Nassau articles only:
- Use the text type filter to select doc.place = Nassau
- Run your query
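Where the query interface supports CQL's within operator, the same restriction can usually be written into the query itself. The sketch below assumes the document structure is named doc and carries a place attribute, as the attribute name doc.place suggests. The search term is only an illustration.

```cql
[lemma="hotel"] within <doc place="Nassau" />
```

Replacing Nassau with Freeport gives the corresponding search over the Freeport articles.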
Quick Reference
| Goal | CQL Query |
|---|---|
| Exact word | [word="Nassau"] |
| All forms (lemma) | [lemma="tourism"] |
| Location names | [ent_type="GPE"] |
| Organization names | [ent_type="ORG"] |
| Person names | [ent_type="PERSON"] |
| Past tense verbs | [morph=".*Tense=Past.*"] |
| Proper nouns | [tag="NNP.*"] |
| Adjacent words | [lemma="prime"] [lemma="minister"] |
References
This corpus is part of the International Corpus of English (ICE) project. ICE-Bahamas is led by Prof. Dr. Stephanie Hackert at LMU Munich. See ice-corpora.uzh.ch for project details.
Key publications:
- Corpus description: Hackert (2010)
- Bahamian Standard English: Bruckmaier and Hackert (2011)
- Urban Bahamian Creole: Hackert (2004)
- Gullah connections: Hackert and Huber (2007)
- Newspaper usage: Oenbring (2010)