ICE Bahamas

Bahamian Press Reports (2002-2007)

Corpus Overview

The ICE Bahamas corpus contains press reports from Bahamian newspapers, providing access to written Standard Bahamian English from the mid-2000s.

This corpus is part of the International Corpus of English (ICE) project and represents the Press Reports (W2C) category of the Bahamas component. It was annotated with spaCy’s transformer model (en_core_web_trf), which supplies the lemma, part-of-speech, dependency, named-entity, and morphological attributes described under Available Attributes.

Key Statistics

Metric        Value
Total Words   ~43,000
Documents     73 press reports
Time Period   2002-2007
Language      Standard Bahamian English
Source        ICE-Bahamas project (LMU Munich)

Geographic Distribution

Place      Documents   Percentage
Nassau     43          59%
Freeport   30          41%
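
The percentages above follow directly from the document counts. A minimal Python sketch of the arithmetic, using only the counts from the table (the variable names are illustrative):

```python
# Document counts copied from the Geographic Distribution table.
counts = {"Nassau": 43, "Freeport": 30}

total = sum(counts.values())

# Share of each place, rounded to the nearest whole percent.
shares = {place: round(100 * n / total) for place, n in counts.items()}

print(total)
print(shares)
```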

Temporal Distribution

Year   Documents
2002   7
2003   14
2004   10
2005   10
2006   4
2007   27

Available Attributes

Token Attributes

Attribute  Description                                  Example
word       Exact word form                              [word="Bahamas"]
lemma      Base form (lemma)                            [lemma="visit"]
tag        POS tag (Penn Treebank)                      [tag="NN.*"]
head_n     Head position (1-based, 0 for root)          Dependency parsing
head       Head lemma                                   Dependency parsing
deprel     Dependency relation                          [deprel="nsubj"]
ent_type   Named entity type (ORG, PERSON, GPE, etc.)   [ent_type="GPE"]
ent_iob    NER IOB tag (B=begin, I=inside, O=outside)   [ent_iob="B"]
morph      Morphological features (UD FEATS format)     [morph=".*Tense=Past.*"]
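
The ent_iob and ent_type attributes work together: B opens an entity, I continues it, and O marks tokens outside any entity. The following Python sketch shows one way to reassemble multi-word entities from those two attributes. The token triples are invented for illustration and are not taken from the corpus, and group_entities is a hypothetical helper, not part of any corpus tool.

```python
def group_entities(tokens):
    """Collect (entity_type, text) spans from (word, ent_iob, ent_type) triples."""
    spans = []
    current = None  # [entity_type, list_of_words] for the entity being built
    for word, iob, ent_type in tokens:
        if iob == "B":
            # A new entity starts; close any entity still open.
            if current is not None:
                spans.append(current)
            current = [ent_type, [word]]
        elif iob == "I" and current is not None:
            # Continue the entity opened by the last B tag.
            current[1].append(word)
        else:
            # O tag (or a stray I): close any open entity.
            if current is not None:
                spans.append(current)
            current = None
    if current is not None:
        spans.append(current)
    return [(ent_type, " ".join(words)) for ent_type, words in spans]


# Invented example sentence, annotated by hand for illustration.
tokens = [
    ("The", "O", ""),
    ("Ministry", "B", "ORG"),
    ("of", "I", "ORG"),
    ("Tourism", "I", "ORG"),
    ("met", "O", ""),
    ("in", "O", ""),
    ("Nassau", "B", "GPE"),
]

entities = group_entities(tokens)
print(entities)
```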

Document Attributes

Attribute       Description            Example Values
doc.id          Document ID            1-73
doc.filename    Original filename      Pr_01.xml, Pr_02.xml, etc.
doc.date        Publication date       YYYYMMDD format
doc.year        Publication year       2002-2007
doc.place       Publication location   Nassau, Freeport
doc.gender      Author gender          f, m, ?, TODO
doc.age         Author age             Mostly unknown
doc.ethnicity   Author ethnic group    Mostly unknown
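
Because doc.date is stored as a compact YYYYMMDD string, exported metadata usually needs parsing before it can be sorted or grouped by date. A minimal Python sketch, assuming the value is always eight digits; the sample date is made up, and parse_doc_date is a hypothetical helper name:

```python
from datetime import datetime


def parse_doc_date(value):
    """Convert a YYYYMMDD string into a datetime.date object."""
    return datetime.strptime(value, "%Y%m%d").date()


# Hypothetical doc.date value, not taken from the corpus.
published = parse_doc_date("20040315")
print(published.isoformat(), published.year)
```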

CQL Query Examples

Named Entity Recognition

Find all location names (countries, cities):

[ent_type="GPE"]

Find all organization names:

[ent_type="ORG"]

Find all person names:

[ent_type="PERSON"]

Part-of-Speech Queries

Find all proper nouns:

[tag="NNP.*"]
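
The trailing .* lets one pattern cover both singular and plural proper nouns. The Python sketch below illustrates which Penn Treebank tags the pattern would accept, assuming the query engine matches the regular expression against the whole tag value (emulated here with re.fullmatch):

```python
import re

# A handful of Penn Treebank tags for comparison.
penn_tags = ["NN", "NNS", "NNP", "NNPS", "JJ", "VBD"]

# Whole-value matching, as assumed for CQL attribute patterns.
matching = [tag for tag in penn_tags if re.fullmatch("NNP.*", tag)]
print(matching)
```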

Find all past tense verbs:

[morph=".*Tense=Past.*"]
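
A morph value packs several features into one string, so a past-tense query has to find Tense=Past anywhere inside it. The Python sketch below assumes the pipe-separated UD FEATS layout (for example Tense=Past|VerbForm=Fin) and whole-value regex matching; the sample value is illustrative rather than copied from the corpus.

```python
import re

# Illustrative morph value in UD FEATS format.
morph = "Tense=Past|VerbForm=Fin"

# Split the value into a feature dictionary.
features = dict(pair.split("=", 1) for pair in morph.split("|"))

# With whole-value matching, wildcards on both sides are needed so that
# Tense=Past can appear at any position in the feature string.
is_past = re.fullmatch(".*Tense=Past.*", morph) is not None

print(features)
print(is_past)
```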

Dependency Relations

Find subjects of verbs:

[deprel="nsubj"]

Find direct objects:

[deprel="dobj"]

Combined Queries

Find “hotel” followed by a location, with at most three tokens in between:

[lemma="hotel"] []{0,3} [ent_type="GPE"]
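
The []{0,3} part stands for zero to three arbitrary tokens between the two anchors. The following Python sketch emulates that gap logic over a list of token dictionaries. It is only an illustration of the pattern, not how a corpus engine evaluates queries; the sentence is invented, and hotel_near_gpe is a hypothetical helper that reports every qualifying pair.

```python
def hotel_near_gpe(tokens, max_gap=3):
    """Return (hotel_index, gpe_index) pairs with at most max_gap tokens between."""
    hits = []
    for i, token in enumerate(tokens):
        if token["lemma"] != "hotel":
            continue
        # Try every allowed gap size: 0, 1, ..., max_gap intervening tokens.
        for gap in range(max_gap + 1):
            j = i + 1 + gap
            if j < len(tokens) and tokens[j]["ent_type"] == "GPE":
                hits.append((i, j))
    return hits


# Invented sentence: "The hotel in downtown Nassau reopened"
tokens = [
    {"lemma": "the", "ent_type": ""},
    {"lemma": "hotel", "ent_type": ""},
    {"lemma": "in", "ent_type": ""},
    {"lemma": "downtown", "ent_type": ""},
    {"lemma": "Nassau", "ent_type": "GPE"},
    {"lemma": "reopen", "ent_type": ""},
]

hits = hotel_near_gpe(tokens)
print(hits)
```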

Find adjectives immediately preceding “tourism”:

[tag="JJ.*"] [lemma="tourism"]

Filter by Location

Search within Nassau articles only:

  1. Use the text type filter to select doc.place = Nassau
  2. Run your query
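
The same restriction can be applied to exported document metadata. The Python sketch below assumes the metadata is available as a list of dictionaries keyed by the document attribute names; the place and year values are invented for illustration, and filter_documents is a hypothetical helper.

```python
def filter_documents(docs, criteria):
    """Keep documents whose metadata equals every value in criteria."""
    return [
        doc for doc in docs
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


# Invented metadata records using the documented attribute names.
documents = [
    {"doc.filename": "Pr_01.xml", "doc.place": "Nassau", "doc.year": 2003},
    {"doc.filename": "Pr_02.xml", "doc.place": "Freeport", "doc.year": 2007},
    {"doc.filename": "Pr_03.xml", "doc.place": "Nassau", "doc.year": 2007},
]

nassau_docs = filter_documents(documents, {"doc.place": "Nassau"})
print([doc["doc.filename"] for doc in nassau_docs])
```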

Quick Reference

Goal                 CQL Query
Exact word           [word="Nassau"]
All forms (lemma)    [lemma="tourism"]
Location names       [ent_type="GPE"]
Organization names   [ent_type="ORG"]
Person names         [ent_type="PERSON"]
Past tense verbs     [morph=".*Tense=Past.*"]
Proper nouns         [tag="NNP.*"]
Adjacent words       [lemma="prime"] [lemma="minister"]

References

This corpus is part of the International Corpus of English (ICE) project. ICE-Bahamas is led by Prof. Dr. Stephanie Hackert at LMU Munich. See ice-corpora.uzh.ch for project details.

Key publications:

  • Corpus description: Hackert (2010)
  • Bahamian Standard English: Bruckmaier and Hackert (2011)
  • Urban Bahamian Creole: Hackert (2004)
  • Gullah connections: Hackert and Huber (2007)
  • Newspaper usage: Oenbring (2010)

Bruckmaier, Elisabeth, and Stephanie Hackert. 2011. “Bahamian Standard English: A First Approach.” English World-Wide 32: 174–205.
Hackert, Stephanie. 2004. Urban Bahamian Creole: System and Variation. Amsterdam, Philadelphia: John Benjamins.
———. 2010. “ICE Bahamas: Why and How?” ICAME Journal 34: 41–53. https://www.ice-corpora.uzh.ch/en/joinice/Teams/iceba.html.
Hackert, Stephanie, and Magnus Huber. 2007. “Gullah in the Diaspora: Historical and Linguistic Evidence from the Bahamas.” Diachronica 24: 279–325.
Oenbring, Raymond A. 2010. “Corpus Linguistic Studies of Standard Bahamian English: A Comparative Study of Newspaper Usage.” The International Journal of Bahamian Studies 16: 51–62.