• FineCoatMummy@sh.itjust.works
    9 hours ago

    Excellent point.

    For a long time, I have thought vocabulary alone would be enough of a footprint to ID someone, if you had a large enough sample of their writing ofc. It’s like browser fingerprints. The words you use, and how often you use them, are a fingerprint. As UnknowableNight points out, some patterns are so distinctive they are nearly enough alone. Yet even without those, you have enough signals. Sentence length. Whether you spell colour or color. Regional expressions. Word use frequency. Whether you bring in vocabulary used mostly in a certain profession, like medicine or law. Whether you write longer paragraphs or more one-liners. None alone is enough. All together, with the 100 other ones smart people can figure out? Probably enough.
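    To make that concrete, here is a rough Python sketch of how a few of those signals could be turned into a comparable "fingerprint". Everything in it is my own assumption for illustration: the names `style_features` and `cosine_similarity`, the tiny `SPELLING_PAIRS` list, and the choice of cosine similarity. It is not how any real tool is known to work, and a serious attempt would use far more features and normalize them properly (here the sentence-length number dominates the score).

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical British/American spelling pairs; a real list would be much longer.
SPELLING_PAIRS = [("colour", "color"), ("favourite", "favorite"), ("organise", "organize")]


def style_features(text):
    """Return a dict of simple stylometric signals for one writing sample."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty input

    features = {
        # Average words per sentence and per paragraph.
        "avg_sentence_len": len(words) / (len(sentences) or 1),
        "avg_paragraph_len": len(words) / (len(paragraphs) or 1),
    }
    # Spelling preference: positive leans British, negative leans American.
    for british, american in SPELLING_PAIRS:
        features[f"spell_{british}"] = float(counts[british] - counts[american])
    # Relative frequency of the writer's 50 most common words.
    for word, n in counts.most_common(50):
        features[f"freq_{word}"] = n / total
    return features


def cosine_similarity(a, b):
    """Cosine similarity between two sparse feature dicts (0.0 if either is empty)."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

    You would call `style_features` on a known sample and on an anonymous one, then compare the two dicts with `cosine_similarity`. With only a handful of features this proves nothing about identity; the point is just that each signal is cheap to compute, so stacking many of them is not hard.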

    Long ago it would have been too much effort, only worth it for targeted cases. Today? Maybe you can do it dragnet-style, seeking to ID every person who writes online.

    I do not know if that happens today. Yet I do not see anything to stop it.