Software Architecture

  • Stefan Schubert-Peters
    Stefan Schubert-Peters    Premium Member
    Compromises for Scalability Reasons
    Hi,

    As this forum is still quite empty and lifeless, I decided to start a discussion on a topic that isn't covered yet but is interesting. Let's see if XING is the right place for technical discussions. If not, we might switch to a blog instead ;-)

    Big internet portals like Google, eBay, Amazon or Facebook (or smaller ones like us) have to handle millions of users a day. For a system architect in such an environment, one of the major challenges is to define or evolve an architecture so that it is and remains responsive even with millions of users at peak times. As a service provider, there are of course other non-functional requirements to maintain, such as:
    - consistency over lots of servers
    - availability and failover
    - ACID guarantees for business-critical systems
    - deployability

    There might be functionally driven requirements as well, such as:
    - portal pages require information from dozens of modules
    - complex user session states

    As you can easily see, these points (and many others that may come to mind) do not fit together at all. Each has to be balanced carefully so that it does not break the others. So you need a trade-off.

    Apart from well-known products that promise to solve many issues at once: Where do you compromise? Where do you make a cut? What is more important, and what is less so?
    Do you rein in your product managers to avoid spreading modules across portal pages?
    Do you trade away some consistency OR session state?
    Do you outsource the problem to the user (e.g. by using JavaScript massively)?
    Do you compromise on transactions and settle for 99%?

    How do you deal with issues like scalability in web environments, or how would you like to or could you imagine dealing with them?

    Of course questions on the topic are welcome as well :-)

    Kind regards,
    Stefan Schubert
  • Wolfgang Rössler
    Wolfgang Rössler    Premium Member   Group moderator
    Re: Compromises for Scalability Reasons
    Hi Stefan,

    Unfortunately I don't have much experience with internet portals. I mostly do architecture for embedded and safety-critical systems. But of course I am curious how to solve these problems.
    One thought I have: one of the central points is probably that only a small percentage of the users actually require write access at the same time. Does anyone have numbers on how many write accesses per second occur at Amazon or eBay? The problem may be that most of the writes hit the same parts of the database.
    Perhaps you can explain some basic strategies for solving such problems.

    Greets
    Wolfgang
  • Stefan Schubert-Peters
    Stefan Schubert-Peters    Premium Member
    Re^2: Compromises for Scalability Reasons
    3 months and no further answers :-( Thanks for yours, Wolfgang!

    To answer your question: I don't know how many write accesses eBay or Amazon have to deal with. Anyway, your approach of thinking about separating reads from writes is the right idea for web-based architectures.
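
    To make the read/write separation a bit more concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of my own: the class name ReadWriteRouter and the host names are hypothetical, and it simply sends every write to one primary and spreads reads round-robin over replicas.

```python
import itertools


class ReadWriteRouter:
    """Route writes to one primary and spread reads across replicas.

    Hypothetical illustration: names and hosts are assumptions,
    not taken from any real eBay or Amazon setup.
    """

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary if no replicas are configured.
        self._read_cycle = itertools.cycle(replicas or [primary])

    def route(self, operation):
        """Return the host that should serve the given operation."""
        if operation == "write":
            return self.primary
        if operation == "read":
            # Round-robin over the read replicas.
            return next(self._read_cycle)
        raise ValueError(f"unknown operation: {operation!r}")


router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(router.route("write"))
print(router.route("read"))
print(router.route("read"))
```

    The compromise in such a setup is usually consistency: if the replicas are updated asynchronously, a read may briefly return older data than the last write.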

    As you can see at http://highscalability.com/links/weblink/24, eBay and Amazon have quite different architectures.
    Amazon really tries a "perfect" approach. Everything is built around a service-oriented approach, ensuring consistency and failover at the same time. With the EC2 service, others can profit from this mature but complex architecture. From this architecture you would expect very few compromises.
    eBay instead tries to partition everything. With availability (of course) being the most important thing, a lot of mechanisms are implemented to avoid downtime. Examples of partitioning are the read-write partitioning you mentioned (a functional split, like separating auction item insertion from auction search) and logical partitioning (simple example: all users with user.id % 2 == 0 on one server and those with user.id % 2 == 1 on the other), not to forget redundancy. Virtualization makes deployment on thousands of servers easy. The compromises are transactions (completely avoided), application state (no state means perfect load balancing), and asynchronous communication between partitions to decouple them.
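
    The logical partitioning by user ID can be sketched in a few lines of Python. This is a simplified illustration, not eBay's actual implementation: the server names and the function shard_for are made up, and I assume two servers, so the modulus is 2.

```python
from collections import Counter

# Hypothetical server names; two partitions, so the modulus is 2.
SHARDS = ["users-db-0", "users-db-1"]


def shard_for(user_id, shards=SHARDS):
    """Pick the server for a user by taking the ID modulo the shard count."""
    return shards[user_id % len(shards)]


# Even IDs land on the first server, odd IDs on the second.
print(shard_for(4))
print(shard_for(7))

# Count how ten consecutive user IDs are distributed over the servers.
placement = Counter(shard_for(uid) for uid in range(1, 11))
print(placement)
```

    One limitation of such a simple scheme: changing the number of servers changes the modulus, so most users would map to a different server and their data would have to be moved.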

    Those are examples you can read everywhere. My question: How would / do you actually do it? I hope there is someone in this group who actually has such topics to discuss. Otherwise this group is pretty incomplete :-D

    Kind Regards
    Stefan