Abstract
While the superiority of language models over traditional search engines has become clear in recent years, their ‘generativity’ remains a serious concern for policymakers. On the input side, it is generally acknowledged that most large models are trained on unethically sourced data. And on the output side, their tendency to hallucinate and misinform makes them unfit for domain-specific work. This paper takes the view that explainability and transparency cannot be achieved simply by putting ‘a human in the loop’. Taking legal work as a concrete site, it proposes Hyvmind – an architecture that puts ‘humans in the centre’ by recording and rewarding semantic labour through tokenised annotations. Its novelty lies in conceptualising legal research as a set of four interconnected functions (source, watch, frame and curate) around a common data-object (source-text). By storing and rewarding annotative work through a distributed ledger system with nested states, it creates a secure, ethical and organic pathway for generating high-quality datasets for the next generation of domain-specific language models.
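The four functions and their shared data-object can be illustrated with a minimal sketch. This is not the paper's implementation: the class and method names (`SourceText`, `Annotation`, `annotate`) and the string payloads are hypothetical, standing in for the tokenised, ledger-backed records the architecture describes.

```python
from dataclasses import dataclass, field
from enum import Enum


class Function(Enum):
    """The four interconnected research functions named in the abstract."""
    SOURCE = "source"
    WATCH = "watch"
    FRAME = "frame"
    CURATE = "curate"


@dataclass
class Annotation:
    """A tokenised record of semantic labour performed on a source text."""
    annotator: str
    function: Function
    payload: str


@dataclass
class SourceText:
    """The common data-object around which the four functions operate."""
    cid: str  # content identifier (CID) of the underlying text
    annotations: list[Annotation] = field(default_factory=list)

    def annotate(self, annotator: str, function: Function, payload: str) -> Annotation:
        """Record semantic labour; in Hyvmind this step would mint a token on the ledger."""
        a = Annotation(annotator, function, payload)
        self.annotations.append(a)
        return a


# Usage: one source text accumulating labour across different functions.
st = SourceText(cid="bafy-example")  # hypothetical CID
st.annotate("alice", Function.FRAME, "maps the provision onto a legal ontology")
st.annotate("bob", Function.CURATE, "verifies the text against the official source")
print(len(st.annotations))  # 2
```

The sketch shows only the shape of the annotation value chain: each act of sourcing, watching, framing or curating attaches a signed, rewardable record to the same source-text object.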
Keywords
tokenised annotations, semantic labour, reciprocal produsage, nested state, legal research, ontological plurality, distributed ledger
Abbreviations
AGLI – Artificially Generated Legal Information
AVC – Annotation Value Chain
CC – Curation Contract
CID – Content Identifier
CNFT – Curation NFT
FNFT – Frame NFT
FT – Fungible Token
HDAO – Hyvmind’s Decentralised Autonomous Organisation
IAA – Inter-Annotator Agreement
ILP – Intellectual Labour Practice
LC – Location Contract
LNFT – Link NFT
LO – Legal Ontologies
NFT – Non-Fungible Token
NLP – Natural Language Processing
NNFT – Nested NFT
ST – Source Text
TNFT – Text NFT
UTS – Utility Token System
UVS – Unrealised Value Space