# Modeling online discourse escalation as a state machine (dataset + labeling approach)

scarlet dew

[D] Modeling online discourse escalation as a state machine
I've been working on a framework to model how online discussions escalate into conflict, and I'm exploring whether escalation can be framed as a classification or sequence modeling problem.
The core idea: treat discourse as a state machine with observable transitions.
Proposed States

1. Neutral: information exchange
2. Disagreement: divergent views, no identity friction
3. Identity Activation: topic shifts toward the self
4. Personalization: focus moves to character/flaws
5. Ad Hominem: rational engagement collapses
6. Dogpile: multi-user targeting, non-recoverable
7. Threats of violence: after exhausting states 1–6

Each comment gets a local state label. Threads have a global state that evolves over time.
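
For concreteness, here is a minimal Python sketch of the local/global split. The state numbering follows the order of the list above. The rule for the global state is an assumption made only for illustration (running maximum of the local labels, so a thread never de-escalates); the post does not specify how the global state is derived, and `thread_states` is a hypothetical helper name.

```python
from enum import IntEnum
from itertools import accumulate


class State(IntEnum):
    """The seven proposed states, numbered in the order listed."""
    NEUTRAL = 1
    DISAGREEMENT = 2
    IDENTITY_ACTIVATION = 3
    PERSONALIZATION = 4
    AD_HOMINEM = 5
    DOGPILE = 6
    THREATS = 7


def thread_states(local_labels):
    """Return the global thread state after each comment.

    ASSUMPTION (not from the post): the global state is the highest
    local state seen so far, so the thread never de-escalates.
    """
    return list(accumulate(local_labels, max))


# Per-comment (local) labels for a hypothetical five-comment thread.
labels = [
    State.NEUTRAL,
    State.DISAGREEMENT,
    State.NEUTRAL,
    State.PERSONALIZATION,
    State.DISAGREEMENT,
]
trajectory = thread_states(labels)
print([s.name for s in trajectory])
```

If threads should be allowed to cool down, the running maximum would need to be replaced with something like a decayed or windowed aggregate.
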
Signals / Features

Linguistic: second-person pronoun frequency, sentiment shift, toxicity markers
Structural: unique users per target, reply velocity, thread depth
Contextual: topic sensitivity, prior state transitions
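
A rough sketch of how three of these signals could be computed, using only the standard library. The comment format `(author, target, timestamp_seconds)`, the pronoun list, and all function names are assumptions for illustration, not part of the framework as stated.

```python
import re
from collections import defaultdict

# ASSUMPTION: a small hand-picked set of second-person forms.
SECOND_PERSON = {"you", "your", "yours", "yourself", "you're"}


def second_person_rate(text):
    """Fraction of tokens that are second-person pronouns."""
    tokens = [t.lower() for t in re.findall(r"[A-Za-z']+", text)]
    if not tokens:
        return 0.0
    return sum(t in SECOND_PERSON for t in tokens) / len(tokens)


def unique_users_per_target(comments):
    """Count distinct authors replying to each target user.

    comments: iterable of (author, target, timestamp_seconds);
    target is None for top-level comments.
    """
    repliers = defaultdict(set)
    for author, target, _ts in comments:
        if target is not None and author != target:
            repliers[target].add(author)
    return {target: len(authors) for target, authors in repliers.items()}


def reply_velocity(comments):
    """Replies per minute over the time span of the thread."""
    times = sorted(ts for _author, _target, ts in comments)
    if len(times) < 2 or times[-1] == times[0]:
        return 0.0
    return (len(times) - 1) / ((times[-1] - times[0]) / 60.0)


# Hypothetical thread: alice posts, bob and carol reply to her.
thread = [
    ("alice", None, 0),
    ("bob", "alice", 30),
    ("carol", "alice", 45),
    ("bob", "alice", 60),
    ("alice", "bob", 120),
]
print(unique_users_per_target(thread))
print(reply_velocity(thread))
```

The per-target count is also the natural place to look if dogpile is treated as a property of the reply graph rather than as a per-comment label.
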

Questions for the group:

Does this work better as per-comment classification or sequence modeling (HMM / transformer over thread)?
Would you treat dogpile as a class label or an emergent graph property?
Any existing datasets that approximate this beyond toxicity classification?
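
To make the first question concrete, here is a toy comparison of per-comment argmax against Viterbi decoding with an HMM over the thread. Everything numeric is made up for illustration: three states instead of seven, a hand-written "sticky" transition matrix, and per-comment classifier scores used directly as emission scores (an assumption; strictly an HMM wants likelihoods, not posteriors).

```python
import math


def viterbi(emission_logp, trans_logp, start_logp):
    """Most likely state path through a thread.

    emission_logp[t][s]: log score of comment t under state s.
    trans_logp[i][j]:    log P(state j | previous state i).
    start_logp[s]:       log P(first comment is in state s).
    """
    n_states = len(start_logp)
    score = [start_logp[s] + emission_logp[0][s] for s in range(n_states)]
    back = []  # back[t-1][j] = best previous state for state j at step t
    for t in range(1, len(emission_logp)):
        prev, new_score = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: score[i] + trans_logp[i][j])
            prev.append(best_i)
            new_score.append(score[best_i] + trans_logp[best_i][j] + emission_logp[t][j])
        back.append(prev)
        score = new_score
    # Backtrack from the best final state.
    path = [max(range(n_states), key=lambda s: score[s])]
    for prev in reversed(back):
        path.append(prev[path[-1]])
    return path[::-1]


def to_log(rows):
    return [[math.log(p) for p in row] for row in rows]


# ASSUMPTION: toy 3-state model (0 = neutral, 1 = disagreement, 2 = ad hominem).
trans = [[0.80, 0.15, 0.05],
         [0.10, 0.70, 0.20],
         [0.05, 0.15, 0.80]]
start = [0.80, 0.15, 0.05]
# Hypothetical per-comment classifier scores; comment 3 is ambiguous.
emissions = [[0.90, 0.08, 0.02],
             [0.20, 0.70, 0.10],
             [0.05, 0.15, 0.80],
             [0.45, 0.15, 0.40],
             [0.10, 0.20, 0.70]]

per_comment = [max(range(3), key=lambda s: row[s]) for row in emissions]
path = viterbi(to_log(emissions), to_log(trans), [math.log(p) for p in start])
print(per_comment)  # [0, 1, 2, 0, 2]
print(path)         # [0, 1, 2, 2, 2]
```

In this toy case the per-comment view reads the ambiguous fourth comment as a return to neutral, while the sequence view keeps the thread in the escalated state because of the transition prior. Whether that smoothing is desirable depends on how often real threads actually de-escalate.
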

White paper: https://github.com/JohannaWeb/Monarch/releases/tag/0.1.paper

scarlet dew

my post on it

github soon with the project and training data