Channel: Highly Scalable Blog
Browsing latest articles
Browse All 12 View Live

MapReduce Patterns, Algorithms, and Use Cases

In this article I digest a number of MapReduce patterns and algorithms to give a systematic view of the different techniques that can be found on the web or in scientific articles. Several practical...

Tricks with Direct Memory Access in Java

Java was initially designed as a safe, managed environment. Nevertheless, the Java HotSpot VM contains a “backdoor” that provides a number of low-level operations to manipulate memory and threads directly....

NoSQL Data Modeling Techniques

NoSQL databases are often compared by various non-functional criteria, such as scalability, performance, and consistency. This aspect of NoSQL is well-studied both in practice and theory because...

Hierarchical Navigation and Faceted Search on Top of Oracle Coherence

Some time ago I participated in the design of a backend for a large online retailer. From the business logic point of view, this was a pretty typical eCommerce service for hierarchical and...

Probabilistic Data Structures for Web Analytics and Data Mining

Statistical analysis and mining of huge multi-terabyte data sets are common tasks nowadays, especially in areas like web analytics and Internet advertising. Analysis of such large data sets often...

Fast Intersection of Sorted Lists Using SSE Instructions

Intersection of sorted lists is a cornerstone operation in many applications including search engines and databases because indexes are often implemented using different types of sorted structures. At...

Speeding Up Hadoop Builds Using Distributed Unit Tests

We recently worked with one of the Hadoop vendors on the continuous integration system for Hadoop core and other Hadoop-related projects such as Pig, Hive, and HBase. One of the challenges we faced was very...

Distributed Algorithms in NoSQL Databases

Scalability is one of the main drivers of the NoSQL movement. As such, it encompasses distributed system coordination, failover, resource management and many other capabilities. It sounds like a big...

In-Stream Big Data Processing

The shortcomings and drawbacks of batch-oriented data processing were widely recognized by the Big Data community quite a long time ago. It became clear that real-time query processing and in-stream...

Data Mining Problems in Retail

Retail is one of the most important business domains for data science and data mining applications because of its prolific data and numerous optimization problems such as optimal prices, discounts,...
