Let’s say we have Lemmy instances A, B, C.

alice from A makes a post “Hello, world” to B. What happens? How is it processed on servers A, B, C and how do users from A, B, C receive her post?

  • flamingos-cant@feddit.uk
    5 days ago

    alice from A makes a post “Hello, world” to B

    Alice can’t make a post to B, but I assume you mean a community on B, let’s call it foo. When Alice makes a post it first goes through A’s local API and creates the local (and canonical) version of Alice’s post. Once A has finished processing Alice’s post, it will create an ActivityPub representation of Alice’s post to send to B.

    ActivityPub is basically a bunch of assumptions laid on top of JSON. An ActivityPub ‘file’ broadly falls into one of three types: Object, Activity, and Actor.[1] These types then have subtypes; for example, both Alice and foo are actors, but Alice is a Person while foo is a Group.

    A second important assumption of ActivityPub is the concept of inboxes and outboxes, but for Lemmy only inboxes matter. An inbox is just a URL where Lemmy can send activities, and it’s something all actors have.

    So when instance A is finished processing Alice’s post, it will turn it into a Page object, wrap that in a Create activity and send it to foo’s inbox.

    Roughly what the JSON would look like:
    {
      "@context": [
        "https://join-lemmy.org/context.json",
        "https://www.w3.org/ns/activitystreams"
      ],
      "actor": "https://a/u/alice",
      "type": "Create",
      "to": ["https://www.w3.org/ns/activitystreams#Public"],
      "cc": ["https://b/c/foo"],
      "id": "https://a/activities/create/19199919009100",
      "object": {
        "type": "Page",
        "id": "https://a/post/1",
        "attributedTo": "https://a/u/alice",
        "to": [
          "https://b/c/foo",
          "https://www.w3.org/ns/activitystreams#Public"
        ],
        "audience": "https://b/c/foo",
        "name": "Hello, world",
        "attachment": [],
        "sensitive": false,
        "language": {
          "identifier": "en",
          "name": "English"
        },
        "published": "2024-12-29T15:10:51.557399Z"
      }
    }
    


    Instance B will then receive this and do the same kind of processing A did when Alice created the post via the API. Once it has finished, it will wrap the Create activity in an Announce activity. B will then look at all the actors that follow foo (i.e. are subscribed to it) and send this Announce to all of their inboxes. Assuming a user on instance C follows foo, C will receive this Announce and process it like A and B before it, creating the local version of Alice’s post.

    Edit: I made a small mistake in an earlier version of this comment. I said that foo wrapped the Page in an Announce, when it actually wraps the Create in an Announce.


    1. Technically, Activity and actors are themselves objects, but they’re treated differently. There are also Collections, which are their own type, but Lemmy doesn’t really utilise them. ↩︎
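
The community-side step described above (wrap the incoming Create in an Announce, then deliver it to followers) can be sketched roughly as follows. This is only an illustration, not Lemmy's actual code: the function names, the `shared_inbox` key, the `/followers` URL in `cc`, and the activity ID are assumptions made for the example.

```python
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def wrap_in_announce(community_id: str, create_activity: dict, announce_id: str) -> dict:
    """Wrap a received Create activity in an Announce issued by the community actor."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "id": announce_id,
        "type": "Announce",
        "actor": community_id,
        "to": [PUBLIC],
        # Assumption: followers are addressed via a followers collection URL.
        "cc": [f"{community_id}/followers"],
        # The whole Create (with the Page inside it) is the announced object.
        "object": create_activity,
    }


def delivery_targets(followers: list[dict]) -> list[str]:
    """Collapse follower inboxes so each instance gets one delivery.

    Prefers a per-instance shared inbox when a follower advertises one
    (hypothetical "shared_inbox" key), otherwise falls back to the
    follower's own inbox.
    """
    return sorted({f.get("shared_inbox") or f["inbox"] for f in followers})


create = {
    "id": "https://a/activities/create/19199919009100",
    "type": "Create",
    "actor": "https://a/u/alice",
    "object": {"type": "Page", "id": "https://a/post/1", "name": "Hello, world"},
}
announce = wrap_in_announce("https://b/c/foo", create, "https://b/activities/announce/1")

followers = [
    {"id": "https://c/u/carol", "inbox": "https://c/u/carol/inbox", "shared_inbox": "https://c/inbox"},
    {"id": "https://c/u/dave", "inbox": "https://c/u/dave/inbox", "shared_inbox": "https://c/inbox"},
    {"id": "https://a/u/alice", "inbox": "https://a/u/alice/inbox", "shared_inbox": "https://a/inbox"},
]
targets = delivery_targets(followers)
```

With two followers on C and one on A, `targets` contains one inbox per instance, which is why B's work scales with the number of subscribed instances rather than the number of subscribed users.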

    • akkajdh999@programming.devOP
      6 days ago

      Thank you, very clear.

      So B will list all users subscribed to foo, look at their instances, and send the update to them.

      I assume that if someone from a new instance (D) subscribes to foo, then D will need to request all the old posts from foo, since they weren’t pushed to D?

      • flamingos-cant@feddit.uk
        6 days ago

        I assume that if someone from a new instance (D) subscribes to foo, then D will need to request all the old posts from foo, since they weren’t pushed to D?

        Lemmy is pretty bad about backfilling content. Communities do have outboxes, but these only list the last 50 posts and you can’t get the votes or comments on any of them. See GitHub issues #5283, #3448 and #2004.

      • Kichae@lemmy.ca
        5 days ago

        ActivityPub works like a magazine subscription. They don’t send you back issues for subscribing.

    • Burstar@sopuli.xyz
      5 days ago

      Why does a Mastodon user get completely different profiles and history when viewed from different Lemmy instances? They look like 2 completely different users when compared, except for having the same @address. In fact, this makes them immune from moderation if they comment from a different instance than the mod is on.

      • flamingos-cant@feddit.uk
        5 days ago

        Mastodon doesn’t have Group support (fep-1b12), so when they reply to a post, they don’t send it to the community’s inbox (only to the inbox of the Person they’re replying to), thus breaking Lemmy’s model of federation.

        • AbouBenAdhem@lemmy.world
          6 days ago

          Why not a binary flag or something? Is it just to avoid making it a formal part of the protocol?

          • JackbyDev@programming.dev
            4 days ago

            Because it is JSON-LD and that’s how JSON-LD works. It’s an extensible format. Similar to XML namespaces.

              • JackbyDev@programming.dev
                4 days ago

                I don’t understand the comment. It’s like calling the fact that firstName is in the JSON {"firstName": "Bob"} “over engineered bullshit” when they should’ve made some application specific protocol instead of using JSON. ActivityStreams and ActivityPub are built on top of JSON-LD to utilize existing libraries to represent linked data (that’s what the LD is). To specify what schemas are used there is a “context” field. There are other schemas as well. Take a look at https://schema.org/ to see them.

                If it feels over engineered it’s because it’s meant to be able to represent a wide variety of types of social media and typical interactions with them. I seriously doubt Mastodon (micro blogging) and Lemmy (link aggregation forum) would be able to interact easily if they weren’t “over engineered”.

                • akkajdh999@programming.devOP
                  4 days ago

                  I don’t care, json-ld is itself overengineered, i.e. bloating every JSON that you send with 300 useless http:// links without an actual purpose (instead of a boolean flag or whatever). This bloated protocol doesn’t even… work properly.

          • flamingos-cant@feddit.uk
            6 days ago

            I actually don’t know, you’d need to ask someone privy to design decisions made with ActivityPub, like Prodromou or Lemmer-Webber. It’s definitely not to avoid making it part of the protocol, because it already is (see the link in the last comment).

              • flamingos-cant@feddit.uk
                2 days ago

                What about JSON-LD makes it so they have to include the “this is public” declaration in the to field instead of having an as:public property on the object? (I don’t know a whole lot about JSON-LD or RDF more broadly)

            • AbouBenAdhem@lemmy.world
              6 days ago

              Thanks—I meant “formal” as in “formal grammar”, not that it wasn’t described in the published protocol. As in, there’s nothing in the protocol’s explicit form that distinguishes between this implied meaning and a real extra recipient—so it simplifies the parsing but adds an extra post-parsing step.
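
The "extra post-parsing step" mentioned here can be sketched as below: after reading the addressing fields, the receiver has to separate the special Public value from real recipients. This is a minimal illustration, not any server's actual code. The function names are made up, only `to` and `cc` are inspected for brevity, and the set of accepted spellings follows the ActivityPub recommendation's note that the public collection may also show up in compacted form.

```python
# Spellings of the special "public" collection a receiver may encounter.
PUBLIC_ALIASES = {
    "https://www.w3.org/ns/activitystreams#Public",
    "as:Public",
    "Public",
}


def _as_list(value) -> list[str]:
    """Addressing fields may be absent, a single string, or a list of strings."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def split_audience(activity: dict) -> tuple[bool, list[str]]:
    """Return (is_public, real_recipients) for an activity's to/cc fields."""
    recipients: list[str] = []
    for field in ("to", "cc"):
        recipients.extend(_as_list(activity.get(field)))
    is_public = any(r in PUBLIC_ALIASES for r in recipients)
    real = [r for r in recipients if r not in PUBLIC_ALIASES]
    return is_public, real


public_post = {
    "type": "Create",
    "to": ["https://www.w3.org/ns/activitystreams#Public"],
    "cc": ["https://b/c/foo"],
}
private_note = {"type": "Create", "to": "https://b/u/bob"}

public_result = split_audience(public_post)
private_result = split_audience(private_note)
```

A dedicated boolean property would remove the filtering step, at the cost of a second place where visibility has to be kept consistent with the recipient lists.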

  • Draconic NEO@lemmy.world
    4 days ago

    Think of it this way: when you make a post, your server automatically distributes it to everyone who is subscribed. Depending on the platform, that could mean subscribers to the community, or followers of your user account in the case of things like Mastodon. When the post is received, it is copied and re-hosted on every server that has subscribers.

    The exceptions are a banned user or a defederated server. In those cases the request is denied, and the instance with the ban or defederation doesn’t re-host the post. Bans and defederation typically only happen in extreme cases, such as defending against spam, hate speech, or abusive users.

    This might be an oversimplified explanation, but I’m trying to keep it simple since that helps people understand the process.
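
As a rough sketch of the ban and defederation check described above: a receiving instance can look at the activity's `actor` before storing anything. The function name and the two sets are hypothetical, and real servers also verify HTTP signatures and apply more rules than this.

```python
from urllib.parse import urlparse


def should_accept(activity: dict, blocked_hosts: set[str], banned_actors: set[str]) -> bool:
    """Decide whether an incoming activity may be processed and re-hosted."""
    actor = activity.get("actor", "")
    host = urlparse(actor).hostname
    # No usable actor URL, or the actor's instance is defederated: reject.
    if host is None or host in blocked_hosts:
        return False
    # The instance is fine, but this particular user is banned: reject.
    return actor not in banned_actors


incoming = {"type": "Create", "actor": "https://a/u/alice"}

accepted = should_accept(incoming, blocked_hosts=set(), banned_actors=set())
defederated = should_accept(incoming, blocked_hosts={"a"}, banned_actors=set())
banned = should_accept(incoming, blocked_hosts=set(), banned_actors={"https://a/u/alice"})
```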

  • JackbyDev@programming.dev
    4 days ago

    It helps when you understand that you only ever directly interact with your instance.

    • Alice posts to A (in some community hosted on B)
    • B is federated with A so will eventually receive the post
    • C is federated with B so will eventually get the post
  • FrostyTrichs@walledgarden.xyz
    6 days ago

    The easiest way to explain it is that the instances have no native ability to crawl other instances for communities or content. For all intents and purposes, a fresh Lemmy server is on an island and all other instances are their own island until someone builds a bridge to them.

    The ability of an instance to receive content depends on the subscriptions its users add to the database. Once a user subscribes to a remote community, that community’s instance starts pushing updates, and you’ll see them regularly whether you interact with them or not.

    This goes completely against what the average person is expecting and causes a lot of confusion.

    • jollyroberts@jolly-piefed.jomandoa.net
      5 days ago

      Piefed instances now do have a form of this for instance admins to populate new instances.

      Admins can:
      - pull the lemmyverse data and subscribe to a bunch of communities at once
      or
      - target a single lemmy or mbin instance, get the list of communities that instance hosts, and subscribe to a bunch of communities on that instance.

      Both have some tunable settings to allow admins control over how many communities are followed.

      It’s not an end-user thing, but it should help with setting up new instances and them not being so ‘empty’.

      edit: typo

    • JubilantJaguar@lemmy.world
      6 days ago

      This goes completely against what the average person is expecting and causes a lot of confusion.

      But this is only true if the user looks at the All feed, correct?

      • FrostyTrichs@walledgarden.xyz
        6 days ago

        But this is only true if the user looks at the All feed

        It impacts what content is available to users at all. The All feed is just the visual representation of what’s actively federating.

        Let’s say you join a new instance for whatever reason with no outside awareness of how the fediverse works. If you try to search the instance for “sportball” and get zero results the natural assumption is going to be that there are no communities and no interest in that topic. The user has no idea that lemmyserver5000.com has a sportball community with thousands of users because no one with those interests ever did the work to get the content flowing in a way that they could access it intuitively. It’s a poor design IMO.

        The reason I brought it up has more to do with starting a new instance or using a smaller instance. Communities that the instance isn’t aware of (via someone previously subscribing) won’t show up at all which causes places to appear non-existent or dead by default. Someone trying a federating website for the first time isn’t going to know this, so to them, that’s all the fediverse has to offer.

        • JubilantJaguar@lemmy.world
          6 days ago

          OK, I see that problem. In fact I remember having the same issue myself. (Presumably this will create a secondary confusion problem for “All” subscribers, who will see the content of their feed gradually expand without explanation as other users subscribe to other foreign servers, correct? Whatever, I don’t care much about them, someone who subscribes to “All” apparently doesn’t know what they want anyway!)

          So the optimal solution here would be for each instance to preemptively connect to a whitelist of known foreign communities, perhaps? Or maybe each instance could regularly ping other servers in order to update its search database with popular communities.

        • Kichae@lemmy.ca
          5 days ago

          It’s a poor design if what you want to do is emulate a centralized social media service.

          But maybe we should stop trying to do that.

          • FrostyTrichs@walledgarden.xyz
            5 days ago

            Maybe.

            But I’d counter that it’s prohibitive to growth. People aren’t used to turning up at a domain name only to find out 90% of the content can’t be accessed without jumping through a bunch of hoops.

    • Zak@lemmy.world
      6 days ago

      instances have no native ability to crawl other instances for communities or content

      That’s not quite true. They don’t do it automatically or routinely, but a user can cause a server to read a post from another server by putting its URL into the search box. This can be useful for an end user to manually address a federation glitch.

      Here’s a concrete example. I was trying to post a comment via lemmy.world, but lemmy.world sits behind Cloudflare, and Cloudflare flagged its content as potentially malicious. I then posted that comment via my own Mastodon server, but push federation to lemmy.world also failed, for the same reason. I could, however, cause lemmy.world to pull the comment using the search.
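
For anyone wanting to script that manual pull: as far as I know, the search box is backed by Lemmy's `resolve_object` API endpoint, which takes the remote URL as a `q` parameter and needs a logged-in user. The sketch below only builds the request URL and does not send anything. The `/api/v3/` path is an assumption that depends on the Lemmy version, and the host names are placeholders.

```python
from urllib.parse import urlencode


def resolve_object_url(home_instance: str, remote_object_url: str) -> str:
    """Build the URL that asks home_instance to fetch a remote post or comment.

    Assumes the v3 API path; authentication (required by Lemmy for this
    call) is left out of the sketch.
    """
    query = urlencode({"q": remote_object_url})
    return f"https://{home_instance}/api/v3/resolve_object?{query}"


url = resolve_object_url("lemmy.example", "https://c.example/post/1")
```

The remote URL is percent-encoded because it travels inside a query string. The home instance then fetches the object itself, which is why this works even when push delivery failed.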

    • Scipitie@lemmy.dbzer0.com
      6 days ago

      Does that mean that an “all” view is “only” all of the subscriptions/places people from my server have?

      That’s quite interesting.

      And thanks!

        • FrostyTrichs@walledgarden.xyz
          6 days ago

          Note that many instances either have a bot subscribed to other communities to force federation, or use something like https://lemmy-federate.com/

          FWIW this approach can be helpful but is flawed in its own ways.

          Firstly, since not all instances participate you still aren’t getting the “complete” fediverse so to speak. This becomes less of an issue as more instances join the bot program, but it’s another step that roadblocks what should be an easy and organic process.

          Secondly, the bot can pose a potential security risk depending on how it’s configured. If you use it to federate in both directions, you’re subject to malicious actors spinning up tons of new communities on instances that don’t restrict user registration. This will in turn hammer the database an instance uses for EVERYTHING and eventually cause slowdowns, crashes, etc. The solution to this is to only seed your communities outwardly, but if everyone only does that the bot is rather useless…

          I don’t have a solution for any of this, I’m just pointing out some rather frustrating problems this platform has in its current state.

          • Rikudou_Sage@lemmings.world
            6 days ago

            Well, you can always defederate if an instance starts abusing it. Not that much different to the normal flow, really.

            • FrostyTrichs@walledgarden.xyz
              6 days ago

              you can always defederate if an instance starts abusing it

              Sure, but potentially after at least one of the instances subscribed to the bot goes down and someone realizes what’s happening. It’s incredibly easy to overwhelm a small server’s database just by subscribing to a lot of communities the normal way. The difference here is potentially any instance federating the bot in both directions is susceptible to this.

              Not that much different to the normal flow, really.

              The impact across the fediverse vs just one instance would be the main difference. Plenty of people are using that bot having no real idea of what it’s doing.

              • Rikudou_Sage@lemmings.world
                5 days ago

                That’s just a part of the learning process, IMO. My instance crashed many times, I’ve fixed it every time and now it’s better than before. And I don’t think I’ve had my last fuck up with the instance.

                • FrostyTrichs@walledgarden.xyz
                  5 days ago

                  And that’s fine for you, I’m not knocking the experimenting and learning process. That was the whole reason I spun up an instance myself.

                  What I’m saying is that to the other users that would be impacted by these things, it sucks. People are patient to a point but the fediverse has a lot of odd quirks that make it more difficult than it should be to use for a lot of people. Things have gotten better in the last year or so but it still feels like we’re asking people to know more than they should have to just to figure out that Lemmy isn’t empty. Many people will get frustrated and leave long before they start making excuses for a site they don’t know anything about.

                  It’s easy to sit around proclaiming that reddit sucks but the fact of the matter is that it’s easy to use and everything they have to offer is covered under one domain. Again, I don’t have the solution to these things for Lemmy, but we can’t deny that this platform is harder to use than most and a lot of people aren’t going to handle that well.

  • Illecors@lemmy.cafe
    6 days ago

    • A user on A makes a post to a community on B
    • B federates that post to all instances that have at least 1 user subbed to the community of the post

    All users from all instances get the post from their home instance.

    • akkajdh999@programming.devOP
      6 days ago

      Thanks but this is quite high-level.

      Okay, so Alice makes a request to A. A makes a request to B. B makes requests to all other instances.

      If you get posts from your home instance, does it mean that all instances duplicate the same database?

      • Zak@lemmy.world
        6 days ago

        They don’t duplicate the database in a technical sense, but when things go right, they each have a copy of the same post and comment text, and the same votes.

        • akkajdh999@programming.devOP
          6 days ago

          Do you mean that the database is not identical, but still duplicates all data, basically? (You said “they each have a copy”; I assume it’s persistent on disk.) So if we have 100 Lemmy instances, they all save the same post.

          • Zak@lemmy.world
            6 days ago

            Correct. Each server that shows the post to its users stores a copy of the post. It does not necessarily store attached media (IIRC Mastodon usually does and Lemmy usually hotlinks media).
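
To make "each server stores a copy" concrete, here is a toy sketch in which every instance keeps its own table of posts keyed by the post's canonical `id` (the URL on the author's instance). The `LocalStore` class and its fields are invented for illustration; a real instance uses a proper database and stores far more.

```python
class LocalStore:
    """A stand-in for one instance's own database of federated posts."""

    def __init__(self) -> None:
        self.posts: dict[str, dict] = {}

    def upsert(self, page: dict) -> bool:
        """Store or refresh the local copy of a Page; return True if it was new."""
        is_new = page["id"] not in self.posts
        self.posts[page["id"]] = {
            "title": page.get("name"),
            "author": page.get("attributedTo"),
            # Media is only referenced here, not copied (hotlinking).
            "attachments": list(page.get("attachment", [])),
        }
        return is_new


page = {
    "type": "Page",
    "id": "https://a/post/1",
    "attributedTo": "https://a/u/alice",
    "name": "Hello, world",
    "attachment": [],
}

instance_b, instance_c = LocalStore(), LocalStore()
first_on_b = instance_b.upsert(page)
first_on_c = instance_c.upsert(page)
repeat_on_c = instance_c.upsert(page)  # e.g. the same Announce delivered twice
```

Keying on the canonical `id` means B and C hold independent rows that refer to the same post, and a repeated delivery updates the existing row instead of creating a duplicate.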