Quantcast
Channel: Data Science – Research
Browsing latest articles
Browse All 11 View Live

Image may be NSFW.
Clik here to view.

HLLs and Polluted Registers

Introduction It’s worth thinking about how things can go wrong, and what the implications of such occurrences might be. In this post, I’ll be taking a look at the HyperLogLog (HLL) algorithm for...

View Article



Image may be NSFW.
Clik here to view.

HyperLogLog++: Google’s Take On Engineering HLL

Matt Abrams recently pointed me to Google’s excellent paper “HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm” [UPDATE: changed the link to the...

View Article

Image may be NSFW.
Clik here to view.

Doubling the Size of an HLL Dynamically – Unions

Author’s Note: This post is related to a few previous posts dealing with the HyperLogLog algorithm.  See Matt’s overview of the algorithm, and see this post for an overview of “folding” or shrinking...

View Article

Image may be NSFW.
Clik here to view.

Doubling the Size of an HLL Dynamically – Extra Bits…

Author’s Note: This post is related to a few previous posts on the HyperLogLog algorithm.  See Matt’s overview of the algorithm, and see this for an overview of “folding” or shrinking HLLs in order to...

View Article

Image may be NSFW.
Clik here to view.

Foundation Capital and Aggregate Knowledge Sponsor Streaming/Sketching...

We, along with our friends at Foundation Capital, are pleased to announce a 1 day mini-conference on streaming and sketching algorithms in Big Data.  We have gathered an amazing group of speakers from...

View Article


Image may be NSFW.
Clik here to view.

Data Science Summit – Update

I don’t think I’m going out on a limb saying that our conference last week was a success. Thanks to everyone who attended and thanks again to all the speakers. Muthu actually beat us to it and wrote up...

View Article

Image may be NSFW.
Clik here to view.

Sketch of the Day: Frugal Streaming

We are always worried about the size of our sketches. At AK we like to count stuff, and if you can count stuff with smaller sketches then you can count more stuff! We constantly have conversations...

View Article

Image may be NSFW.
Clik here to view.

Differential Privacy: The Basics

Hi! I’m Anthony Tockar. I am a masters student at Northwestern University and have been working with the Science team for the summer. This is the first of two posts I will contribute on the topic of...

View Article


Image may be NSFW.
Clik here to view.

Riding with the Stars: Passenger Privacy in the NYC Taxicab Dataset

In my previous post, Differential Privacy: The Basics, I provided an introduction to differential privacy by exploring its definition and discussing its relevance in the broader context of public data...

View Article


Image may be NSFW.
Clik here to view.

Neustar at the 2015 Grace Hopper Celebration of Women in Computing

Author’s Note: Hello readers! I’m Julie Hollek, and I am a data scientist at Neustar, where I focus on understanding questions around identity. Before joining Neustar, I received my PhD in astronomy...

View Article
Browsing latest articles
Browse All 11 View Live




Latest Images