Introducing and Open Sourcing Ambry: LinkedIn's New Distributed Object Store. Media content has become ubiquitous around the web, and almost all of LinkedIn's new features interact with media in some form or another. Profile photos, email attachments, logos, and influencer posts are a few examples of where photos, videos, PDFs, and other media types get uploaded and displayed to the end user. These…
Apache Kafka is a highly scalable messaging system that plays a critical role as LinkedIn's central data pipeline. Kafka was developed at LinkedIn back in 2010, and it currently handles more than 1.4 trillion messages per day across over 1400 brokers. Kafka's strong durability and low latency have enabled us to power a number of newer mission-critical use cases at LinkedIn. These incl…
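A quick back-of-envelope calculation puts the scale quoted above in perspective. The figures (1.4 trillion messages/day, 1400 brokers) are the ones stated in the excerpt; the per-broker rates derived here are averages and assume an even spread across brokers, which real clusters only approximate:

```python
# Back-of-envelope: average per-broker load implied by the quoted Kafka figures.
MESSAGES_PER_DAY = 1.4e12   # ~1.4 trillion messages/day (from the post)
BROKERS = 1400              # ~1400 brokers (from the post)
SECONDS_PER_DAY = 86_400

per_broker_per_day = MESSAGES_PER_DAY / BROKERS            # ~1 billion msgs/broker/day
per_broker_per_sec = per_broker_per_day / SECONDS_PER_DAY  # ~11,600 msgs/broker/sec

print(f"{per_broker_per_day:.2e} messages per broker per day")
print(f"{per_broker_per_sec:,.0f} messages per broker per second")
```

Even as a crude average, that is on the order of ten thousand messages per second per broker, sustained, which is why durability and latency at that rate are the headline properties.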
The LinkedIn engineering team has developed and built Apache Kafka into a powerful open source solution for managing streams of information. We use Kafka as the messaging backbone that helps the company's applications work together in a loosely coupled manner. LinkedIn relies heavily on the scalability and reliability of Kafka and a surrounding ecosystem of both open source and internal components…
Benchmarking Apache Kafka: 2 Million Writes Per Second (On Three Cheap Machines). I wrote a blog post about how LinkedIn uses Apache Kafka as a central publish-subscribe log for integrating data between applications, stream processing, and Hadoop data ingestion. To actually make this work, though, this "universal log" has to be a cheap abstraction. If you want to use a system as a central data hub…
The Log: What every software engineer should know about real-time data's unifying abstraction. I joined LinkedIn about six years ago at a particularly interesting time. We were just beginning to run up against the limits of our monolithic, centralized database and needed to start the transition to a portfolio of specialized distributed systems. This has been an interesting experience: we built, dep…
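The core abstraction the post above describes, an append-only log that many independent readers consume at their own pace, can be sketched in a few lines. This is an illustrative toy, not Kafka's implementation; the `Log` class and its methods are invented here for the sketch:

```python
class Log:
    """Toy append-only log: writers append records, and each reader
    tracks its own offset into the log independently."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Append a record; return the offset it was written at."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset):
        """Return all records at or after the given offset."""
        return self._records[offset:]


# Two independent consumers replay the same log from different offsets.
log = Log()
for event in ["user_created", "profile_updated", "photo_uploaded"]:
    log.append(event)

print(log.read(0))  # consumer A, starting from the beginning: all three events
print(log.read(2))  # consumer B, already caught up to offset 2: only the newest
```

The key property this illustrates is that the log decouples producers from consumers: the writer never knows who is reading, and each reader's progress is just an integer offset it can rewind or replay at will.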
At LinkedIn, many individual services integrate together to deliver a reliable and consistent end-user experience. Although each service handles a specialized set of responsibilities, they all share a common set of required features such as load balancing, dynamic reconfiguration, health monitoring, and fault detection. Last year we introduced Apache Helix, an open-source generic cluster management…
Optimizing Linux Memory Management for Low-latency / High-throughput Databases. Co-author: Cuong Tran. Table of Contents: Introduction; Setting up the context; Reproducing and understanding Linux's zone reclaim behavior; NUMA memory rebalancing also triggers direct page scans; Lessons learned. Introduction: GraphDB is the storage layer of LinkedIn's real-time distributed social graph service. Our service h…
LinkedIn operates the world's largest professional network, with more than 645 million members in over 200 countries and territories. This team builds distributed systems that collect, manage, and analyze this digital representation of the world's economy, while our AI experts, data scientists, and researchers conduct applied research that fuels LinkedIn's data-driven products and provides insights tha…