eBay announced last week that it has been custom-building its own servers as part of an effort to replatform and modernize its backend infrastructure; following Facebook's lead, the e-commerce giant will open-source the designs in Q4 of this year. In a blog post on the eBay site, Mazen Rawashdeh, VP of Platform Engineering, wrote, “eBay is announcing our own custom-designed servers, built by eBay, for eBay”.
The digital sales site revealed that the custom server build is part of “an ambitious three-year effort”, which it is now halfway through. Nine months of intensive research and testing of its applications, along with partnerships with its principal hardware vendors, have helped the company tailor a set of servers to its specific needs. The infrastructure required to run its global marketplace (175 million active users and over 1.1 billion live listings) is necessarily huge: eBay processes 300 billion data queries each day, and its data footprint exceeds 500 petabytes. Servers designed specifically for its complex marketplace give the company a new level of control over the infrastructure that runs and maintains the platform, directly impacting customer experience.
The redesign of eBay’s backend infrastructure has also involved a migration to an edge computing architecture, the decentralization of its cluster of data centers through a new PoP (point of presence) strategy, the design and construction of its own hardware components, and a new AI-driven engine.
In so doing, the company aims to reduce its reliance on third parties and secure greater reliability and improved performance. Rawashdeh writes, “midway through our journey, we are already seeing meaningful results that offer greater predictability, more control and needed flexibility”. He also boasts that while such extensive replatforming would typically require a significant investment at other companies, “eBay has been able to replatform on an aggressive timeline without incremental cost”. Indeed, it has already reinvested savings from the replatforming back into the business.
Tackling Each Layer of the Technology Stack
As eBay tackled the ambitious project, its focus necessarily spread across the entire technology stack – both its physical and logical layers – as each layer is intertwined with those surrounding it. In Rawashdeh’s words, “The stack is like connective tissue, you cannot isolate one of the layers; you must advance them together. To make a meaningful impact, the transformation should be cohesive and orchestrated from end-to-end”. To achieve this, the company’s management and development team systematically combed through each layer of its technology stack, taking a close look at its current efficiencies and capabilities in order to consider how to improve existing solutions.
As part of its overall replatforming effort, eBay is taking significant strides away from OpenStack. At the OpenStack Summit held in Boston in May 2017, eBay said 95% of all its traffic ran on its OpenStack cloud, which at the time managed 167,000 virtual machines and 4,000 apps. Since then, the company has significantly moved away from OpenStack as part of its three-year infrastructure initiative.
Decentralizing Data Centers and the Migration to the Edge
At the physical foundation level, the company’s first efforts were to decentralize its cluster of data centers within the U.S. and migrate to an edge architecture, with the intention of creating a faster, more reliable user experience and saving 600-800 milliseconds of load time. By bringing its servers and data closer to the endpoint (the customer), eBay hopes to optimize the user experience by lowering latency and using dynamic and static caching at the edge.
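As a rough illustration of the caching half of that approach (not eBay’s actual implementation), the sketch below shows how an edge node might keep static assets cached for a long time while caching dynamic responses only briefly, falling back to the origin data center on a miss; the function names and TTL values are hypothetical.

    import time

    # Hypothetical TTLs: static assets can live at the edge far longer than
    # dynamic, personalized responses.
    STATIC_TTL_SECONDS = 24 * 60 * 60   # e.g. images, scripts
    DYNAMIC_TTL_SECONDS = 5             # e.g. listing data, prices

    _cache = {}  # key -> (expires_at, payload)

    def fetch_from_origin(key):
        """Placeholder for a round trip to a central data center."""
        return f"origin-response-for-{key}"

    def edge_lookup(key, static=False):
        """Serve from the edge cache when possible, otherwise go to the origin."""
        now = time.time()
        entry = _cache.get(key)
        if entry and entry[0] > now:
            return entry[1]                       # edge hit: no long-haul round trip
        payload = fetch_from_origin(key)          # edge miss: pay the latency cost once
        ttl = STATIC_TTL_SECONDS if static else DYNAMIC_TTL_SECONDS
        _cache[key] = (now + ttl, payload)
        return payload

Every request served out of the edge cache avoids a round trip to a distant data center, which is where the hundreds of milliseconds of saved load time would come from.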
Rawashdeh writes, “It took us nine months to build our prototype and to deploy our custom hardware. With this shift, we are homogenizing our infrastructure, leading to significant development and operational efficiencies”.
The Data Layer and Build of NuData
At the data layer, eBay is also investing in constructing more customized models. It has drawn on open source technologies to build NuData, a fault-tolerant, geo-distributed object and data store. This will enable eBay to distribute data at a more local level, improving the customer and partner experience, including the ability to offer data isolation options according to each region’s specific requirements.
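eBay has not published NuData’s interface, but the goals described above can be sketched as a small, toy geo-distributed store in which objects are written to a home region, replicated to peers, and optionally pinned to that region when isolation rules require it; the class, method and region names below are all hypothetical.

    class GeoObjectStore:
        """Toy sketch of a geo-distributed object store with per-region isolation."""

        def __init__(self, regions):
            # One in-memory bucket per region; a real store would back each
            # region with replicated, fault-tolerant storage.
            self.buckets = {region: {} for region in regions}

        def put(self, key, value, home_region, isolate=False):
            """Write to the home region; replicate unless the data must stay local."""
            self.buckets[home_region][key] = value
            if not isolate:
                for region, bucket in self.buckets.items():
                    if region != home_region:
                        bucket[key] = value   # would be asynchronous replication in practice

        def get(self, key, region):
            """Read from the caller's nearest region, falling back to any replica."""
            if key in self.buckets[region]:
                return self.buckets[region][key]
            for bucket in self.buckets.values():
                if key in bucket:
                    return bucket[key]
            raise KeyError(key)

    store = GeoObjectStore(["us-west", "eu-central", "ap-south"])
    store.put("listing:123", {"title": "vintage camera"}, home_region="us-west")
    store.put("user:42:profile", {"country": "DE"}, home_region="eu-central", isolate=True)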
In an interview with Forbes last year, Seshu Adunuthula, Senior Director of Analytics Infrastructure at eBay, said, “Data is eBay’s most important asset.” Given the volume of data that moves through eBay every day, this should not be a surprising claim. The key to eBay’s future success, however, depends on how quickly it can turn data into a personalized experience that drives sales. The need for a platform capable of storing a huge amount of data that varies by type has pushed the company to migrate from a traditional data warehouse structure towards what it dubs data lakes. The company retains nine quarters of historical trends data to offer insights on items like year-over-year growth; it also has to analyze data in real time to assist shoppers across the selling cycle. The company’s work with the open-source platform Hadoop has allowed eBay to scale and design product enhancements.
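As an illustration of the kind of batch analysis such a data lake supports (not eBay’s actual schema or jobs), the following PySpark sketch reads quarterly snapshots from Parquet files on Hadoop and computes year-over-year growth; the paths, column names and metric are hypothetical.

    from pyspark.sql import SparkSession, functions as F

    # Hypothetical: quarterly sales snapshots stored as Parquet in a Hadoop data lake.
    spark = SparkSession.builder.appName("yoy-growth-sketch").getOrCreate()
    sales = spark.read.parquet("hdfs:///lake/sales/quarterly/")   # nine quarters retained

    current = (sales.filter(F.col("quarter") == "2018-Q3")
                    .groupBy("category").agg(F.sum("gmv").alias("gmv_now")))
    year_ago = (sales.filter(F.col("quarter") == "2017-Q3")
                     .groupBy("category").agg(F.sum("gmv").alias("gmv_prev")))

    yoy = (current.join(year_ago, "category")
                  .withColumn("yoy_growth",
                              (F.col("gmv_now") - F.col("gmv_prev")) / F.col("gmv_prev")))
    yoy.orderBy(F.col("yoy_growth").desc()).show()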
The company has been working with various open source tools in its data efforts, including Apache Spark, Storm, Kafka, and Hortonworks HDF. The data services layer of eBay’s strategy supports functions that allow the company to access and query data: its data analysts can search the information tags linked to the data (metadata) and make it consumable to a large number of people with the right levels of permissions and security (data governance). eBay is additionally deploying Presto, an interactive query engine on Hadoop. The company has been at the vanguard of using big data solutions and actively contributing its knowledge back to the open source community for some time.
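As a hedged sketch of what an interactive query against that Hadoop data might look like through Presto, the snippet below uses the presto-python-client package; the host, catalog, schema and table names are invented for illustration.

    import prestodb  # presto-python-client, assumed to be installed

    # Ad-hoc interactive query over data in Hadoop via Presto. The connection
    # details and table are hypothetical.
    conn = prestodb.dbapi.connect(
        host="presto.internal.example.com",
        port=8080,
        user="analyst",
        catalog="hive",
        schema="marketplace",
    )
    cur = conn.cursor()
    cur.execute("""
        SELECT category, COUNT(*) AS live_listings
        FROM listings
        WHERE status = 'ACTIVE'
        GROUP BY category
        ORDER BY live_listings DESC
        LIMIT 20
    """)
    for category, live_listings in cur.fetchall():
        print(category, live_listings)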
Apache Kylin
As far back as 2014, eBay open sourced Kylin, its distributed analytics engine offering multiple advanced features for big data analytics. Its SQL interface and multi-dimensional analysis (OLAP) are aimed at enabling the use of SQL-compatible tools and accelerating analytics on Hadoop in order to support extremely large datasets. As eBay’s data volume increased and its user base diversified, the company realized that no available external products exactly met its specific requirements, particularly in the open-source Hadoop community. To meet its emerging business needs, eBay decided instead to build the Kylin platform from scratch and then, once it had proved successful in production, to open source it.
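As an illustration of Kylin’s SQL-on-Hadoop interface, the sketch below submits an OLAP-style aggregation to a Kylin instance through its REST query API, using the sample project and table that ship with Kylin; the host and credentials are placeholders, and a production deployment would differ.

    import requests

    KYLIN = "http://kylin.internal.example.com:7070/kylin/api"   # hypothetical host

    # An OLAP-style aggregation answered from a pre-built Kylin cube rather
    # than by scanning raw data on Hadoop. kylin_sales / learn_kylin are the
    # sample table and project bundled with Apache Kylin.
    sql = """
        SELECT part_dt, lstg_format_name,
               SUM(price) AS gmv, COUNT(DISTINCT seller_id) AS sellers
        FROM kylin_sales
        GROUP BY part_dt, lstg_format_name
    """

    resp = requests.post(
        f"{KYLIN}/query",
        json={"sql": sql, "project": "learn_kylin", "offset": 0, "limit": 100},
        auth=("ADMIN", "KYLIN"),   # Kylin's default sample credentials; replace in practice
    )
    resp.raise_for_status()
    for row in resp.json()["results"]:
        print(row)

Because the aggregation is answered from a precomputed cube, queries like this return in seconds rather than requiring a full scan of the underlying data.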
Following Kylin’s open-sourcing, it became an Apache top-level project in 2015. Since then, it has been rapidly developed by the Apache Kylin community and deployed at many companies all over the globe. Concurrently, eBay has seen a rapid increase in the number of data analysis applications deployed on the Apache Kylin platform within eBay. As of August last year, eBay reported that there were “144 cubes in the production environment of eBay, and this number continues to grow on an average of 10 per month. The largest cube is about 100T”.
eBay’s In-House AI Engine
Another new build is eBay’s own in-house AI engine, aimed at increasing productivity by enabling greater collaboration across the company and improving training on the massive data set that moves through eBay daily.
The AI engine is creating new opportunities for eBay’s data science and engineering teams to experiment more quickly, iterating on and building new products and customer experiences, such as using VR and mixed reality to enhance the retail experience. AI tools also make deeper personalization possible. New features under the AI umbrella include computer vision, Image Search and social sharing techniques. Development time on new features has been slashed “from weeks to hours” through use of the AI engine.
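To make the computer-vision side of this concrete, here is a minimal, hypothetical sketch of the nearest-neighbour step behind an image-search feature: a query image’s embedding vector is compared against a catalog of listing-image embeddings using cosine similarity. How pixels become vectors (the embedding model) is assumed rather than shown, and the data here is random.

    import numpy as np

    rng = np.random.default_rng(0)

    # One embedding vector per listing image; in practice these would come
    # from a trained vision model, not random numbers.
    catalog = rng.normal(size=(100_000, 256)).astype(np.float32)
    catalog /= np.linalg.norm(catalog, axis=1, keepdims=True)

    def top_matches(query_vec, k=5):
        """Return indices of the k listings whose images look most similar."""
        q = query_vec / np.linalg.norm(query_vec)
        scores = catalog @ q                  # cosine similarity, since rows are unit length
        return np.argsort(scores)[::-1][:k]

    print(top_matches(rng.normal(size=256).astype(np.float32)))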
eBay hired a new VP and chief scientist of AI earlier this year. Jan Pedersen joined the company from Twitter with the goal of focusing on computer vision, personalization functions, dynamic pricing and search features. The company is focused on embedding AI and ML capabilities across its front- and backend processes for a diverse range of purposes – from improving the contextual advertising that brings new and returning customers to the site, to using computer vision and deep learning to offer “personalized, immersive shopping experiences”.
Event Sourcing within eBay’s Continuous Delivery Team
Event Sourcing is an area that eBay has written about extensively this year, demonstrating its use within its continuous delivery program. Event Sourcing allows the CD team to “easily scale the processing of incoming data”, since each event is processed sequentially, making concurrency issues easier to address. It also allows the code to be separated into parts that record information as it comes in, “parts that calculate our final model and parts that act when our model changes”, which the team says “makes testing, designing and debugging this code much simpler”.
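A minimal sketch of that three-way split might look like the following (the event types and the “deployments per service” model are invented for illustration, not taken from eBay’s system): one part records events as they arrive, one rebuilds the current model by replaying the event log in order, and one reacts when the model changes.

    # Part 1: append-only recording of incoming events.
    class EventStore:
        def __init__(self):
            self.events = []

        def record(self, event):
            self.events.append(event)

    # Part 2: fold the event log, in order, into the current model.
    def project(events):
        model = {}
        for event in events:                  # strictly sequential, so no concurrency issues
            if event["type"] == "deployment_started":
                model[event["service"]] = "deploying"
            elif event["type"] == "deployment_finished":
                model[event["service"]] = "live"
        return model

    # Part 3: act only when the model actually changes.
    def react(old_model, new_model):
        for service, status in new_model.items():
            if old_model.get(service) != status:
                print(f"{service} is now {status}")

    store, model = EventStore(), {}
    for incoming in [{"type": "deployment_started", "service": "checkout"},
                     {"type": "deployment_finished", "service": "checkout"}]:
        store.record(incoming)
        new_model = project(store.events)
        react(model, new_model)
        model = new_model

Because each part can be exercised in isolation (events replayed into a model, reactions tested against model differences), the structure echoes the team’s point about simpler testing, design and debugging.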
The Wider Picture
The latest announcement from eBay is one of a number of technology-related announcements this year. In August, the company unveiled a series of API updates aimed at bolstering marketplace growth through improved buyer experiences. Retailers, both online and in stores, are struggling against the mighty Amazon, and improved technology offerings are proving essential to keeping up and gaining an edge in the market.
The company has also been focused on creating an innovative culture within the company, building an atmosphere of learning in which creativity is nurtured and risk aversion is discouraged. A recent study from the Boston Consulting Group revealed that those companies which deliberately foster a more supportive, transformative digital culture at the internal level make more rapid strides in digital transformation.
Part of the e-commerce giant’s growth plan for the last two years has been concentrated on the developer community. The company acknowledged that external developers drove over $9.6 billion of gross merchandise volume globally and generated over 2.1 billion new listings in the first half of this year.
Why Open Source?
Across its own redevelopment, eBay has relied upon open source technologies to help fuel its transformation, customizing open source programs such as Kubernetes, EnvoyProxy, Apache Kafka and MongoDB for its own needs. Now the company wants to “give back” by sharing its own innovations and new technologies with the broader engineering community. It also hopes that as developers build on its open-sourced designs, they will improve eBay’s offerings. eBay is incorporating an open source philosophy into its software development strategies as well as its hardware, enhancing its API capabilities and investing in its developer program and support services.
Companies of all sizes are increasingly taking up open source software. Some are becoming vocal advocates of its use, investing money in projects and working with developers; Facebook, for instance, has its own open source program, which encourages others to release their code as open source and actively works with the community to support open source projects.
In a FOSS talk at the Rochester Institute of Technology last November, Christine Abernathy, a Facebook developer and open source advocate, explained why open source is a crucial part of the work the company does. She said that open sourcing its technology is part of Facebook’s overall mission to “share solutions that could help others”. Just as importantly, it can accelerate innovation, produce better software and reduce redundant code by facilitating development and sharing across teams.