poVoq@slrpnk.net to Privacy@lemmy.dbzer0.com · English · 9 days ago

Large-scale online deanonymization with LLMs

arxiv.org

We show that large language models can be used to perform at-scale deanonymization. With full Internet access, our agent can re-identify Hacker News users and Anthropic Interviewer participants at high precision, given pseudonymous online profiles and conversations alone, matching what would take hours for a dedicated human investigator. We then design attacks for the closed-world setting. Given two databases of pseudonymous individuals, each containing unstructured text written by or about that individual, we implement a scalable attack pipeline that uses LLMs to: (1) extract identity-relevant features, (2) search for candidate matches via semantic embeddings, and (3) reason over top candidates to verify matches and reduce false positives. Compared to classical deanonymization work (e.g., on the Netflix prize) that required structured data, our approach works directly on raw user content across arbitrary platforms. We construct three datasets with known ground-truth data to evaluate our attacks. The first links Hacker News to LinkedIn profiles, using cross-platform references that appear in the profiles. Our second dataset matches users across Reddit movie discussion communities; and the third splits a single user's Reddit history in time to create two pseudonymous profiles to be matched. In each setting, LLM-based methods substantially outperform classical baselines, achieving up to 68% recall at 90% precision compared to near 0% for the best non-LLM method. Our results show that the practical obscurity protecting pseudonymous users online no longer holds and that threat models for online privacy need to be reconsidered.
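To make the three stages in the abstract concrete, here is a toy sketch of the closed-world matching loop. It is not the paper's implementation: word counts stand in for the LLM feature extraction, cosine similarity over those counts stands in for semantic embeddings, and a score-and-margin threshold stands in for the LLM reasoning step. All function names, thresholds, and sample profiles are made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stop list; a real system would let an LLM pick identity-relevant features.
STOPWORDS = {"a", "an", "and", "at", "i", "in", "is", "my", "of", "on", "the", "to", "with"}


def extract_features(text):
    """Stage 1 stand-in: keep lowercase content words as 'identity-relevant features'."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_candidates(query, database, k=3):
    """Stage 2 stand-in: rank database profiles by similarity to the query features."""
    scored = [(cosine(query, feats), name) for name, feats in database.items()]
    return sorted(scored, reverse=True)[:k]


def verify(candidates, min_score=0.3, min_margin=0.1):
    """Stage 3 stand-in: accept the top candidate only if it is strong and clearly ahead.

    Returning None (abstaining) is how this sketch trades recall for precision.
    """
    if not candidates:
        return None
    best_score, best_name = candidates[0]
    runner_up = candidates[1][0] if len(candidates) > 1 else 0.0
    if best_score >= min_score and best_score - runner_up >= min_margin:
        return best_name
    return None


def link_profiles(db_a, db_b):
    """Run all three stages for every profile in db_a against db_b."""
    feats_b = {name: extract_features(text) for name, text in db_b.items()}
    return {
        name: verify(top_candidates(extract_features(text), feats_b))
        for name, text in db_a.items()
    }


# Made-up pseudonymous profiles, purely for illustration.
db_a = {
    "hn_user_1": "I maintain a Rust compiler plugin and live in Lisbon",
    "hn_user_2": "Thoughts on coffee",
}
db_b = {
    "profile_x": "Rust compiler engineer based in Lisbon",
    "profile_y": "Marketing manager who loves hiking in Norway",
}

links = link_profiles(db_a, db_b)
print(links)
```

With these sample profiles, `hn_user_1` is linked to `profile_x` and `hn_user_2` gets no match. The abstention in the last stage is the part that matters for the "recall at 90% precision" framing: the stricter the verification, the fewer links are reported, but the more of them are correct.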
  • quediuspayu@lemmy.dbzer0.com · 8 days ago

    I wonder if this thing can also be used in reverse: instead of finding who an anonymous account belongs to, finding which anonymous accounts someone has.

    • poVoq@slrpnk.net (OP) · 8 days ago

      Most likely? It just correlates accounts as far as I can tell.
