Representing words as dense vectors in continuous space — capturing semantic relationships that bag-of-words approaches miss.
By the end of this week, you will be able to:
💻 R scripts used during class:
No server access? Download the GloVe embeddings file and run the local version above.
Due: May 26 @ 11:59 pm
In this lab you will train word embeddings directly from your Nexis Uni corpus and compare them against pre-trained GloVe embeddings. You will query nearest neighbors, explore how domain-specific training shapes the embedding space, and visualize semantic relationships in your environmental topic area.
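To make "querying nearest neighbors" concrete, here is a minimal sketch that ranks words by cosine similarity to a query word. It is an illustration only, not the lab code: the class scripts are in R, while this sketch uses plain Python so it runs without extra packages. The three-dimensional vectors, the vocabulary, and the names `toy_embeddings`, `cosine_similarity`, and `nearest_neighbors` are all made up for the example. Real embeddings, whether trained on your corpus or loaded from GloVe, have far more dimensions.

```python
import math

# Hypothetical toy vectors, invented for illustration only.
# Real embeddings typically have 50 to 300 dimensions.
toy_embeddings = {
    "drought":  [0.9, 0.1, 0.0],
    "wildfire": [0.8, 0.3, 0.1],
    "flood":    [0.7, 0.0, 0.4],
    "policy":   [0.1, 0.9, 0.2],
    "senate":   [0.0, 0.8, 0.3],
}


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(word, embeddings, k=3):
    """Return the k words most similar to `word`, excluding the word itself."""
    target = embeddings[word]
    scores = [
        (other, cosine_similarity(target, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    # Highest similarity first.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


neighbors = nearest_neighbors("drought", toy_embeddings, k=2)
for neighbor, score in neighbors:
    print(f"{neighbor}: {score:.3f}")
```

Running the same query against two embedding sets, one trained on a domain corpus and one pre-trained, and comparing the neighbor lists side by side is one simple way to see how domain-specific training reshapes the space.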
| Type | Resource |
|---|---|
| Interactive | Word2Vec Galaxy — interactive 3D visualization of word embedding space |
| Citation | Year | Topic | Keywords |
|---|---|---|---|
| Callaghan et al. — Nature Climate Change | 2021 | Evidence & Attribution Mapping | BERT, 100k+ studies, climate impacts, geospatial attribution, evidence synthesis |
| Bingler et al. — ClimateBERT | 2022 | Domain-Adapted Climate Language Model | BERT fine-tuning, climate narratives, ESG reports, sustainability disclosure, NLP |
| Citation | Year | Topic | Keywords |
|---|---|---|---|
| Authors — Climate Knowledge or Climate Debate? | — | Climate Discourse Analysis | Word2Vec, semantic shift, expert vs. media framing, ideological variation, vector distance |
| Authors — Using Word Embeddings to Learn a Better Food Ontology | — | Environmental Lexicon Expansion | geotagged social media, food systems, land use, co-occurrence, ontology learning |
| Citation | Year | Topic | Keywords |
|---|---|---|---|
| Jeawak et al. — Ecological Informatics | — | Predicting Environmental Features | spatiotemporal embeddings, social media, species distributions, localized climate features, geotext |
| Citation | Year | Topic | Keywords |
|---|---|---|---|
| Authors — Using word embedding for environmental violation analysis | — | Oil & Gas Compliance | word embeddings, violation text, enforcement trends, shale gas, semantic distance, regulatory NLP |