Wednesday, November 14, 2012

Statistics - Who Pays to Gather the Numbers?

Recently, Nate Silver became a darling of political media. He not only predicted who would win the Presidential election, he even predicted each individual state's vote within 2%, and who would win every single Senate race except North Dakota's - 32 for 33.

This has led to sour grapes from other pollsters, in particular Gallup, a national poll Nate Silver briefly called out as being the worst of the 50 or so he tracks. In Gallup's rebuke of Nate Silver, they make an interesting complaint: They spend the money to gather polling data, and if they want to improve their polls, they have to spend more. Yet they are not the darling of the polling industry - nor even is the most accurate of the 50 polls Nate tracked. Instead, Nate is. And what did Nate or his 538 organization spend on polling? $0.

So, Gallup gripes, perhaps Gallup should get out of polling and get into the business Nate is in. It's much cheaper, and might get you a lot more readers - that, after all, is where the money's at. Unfortunately Gallup ends with the typical gunshot into the air that businesses losing to the internet fire off: Maybe the government should pay for our business with taxes, because we can't figure out how to make money off of it. Some may recall the music industry proposed an internet tax when they realized they couldn't get $14 for a physical CD anymore. Both ideas were equally stupid.

But this is not the first time an issue like this has come up, where one group assembles original data at great expense, and another, seeing it online, gathers it for free and makes use of it without paying a dime. Google's entire business model is doing just that, for which they make billions a year. In a very clear-cut case, a little company, "Mocality," put together a website and incentives program to essentially build a Yelp for Kenya. Prior to this initiative, essentially 0 Kenyan businesses were online. Now with the work of some strapping Kenyan IT personnel and over $100,000 in payments to Kenyans who contributed to this online database (a LOT of money in Kenya!), they have a significant number of Kenyan businesses in their database.

So Google, also looking to have Kenyan businesses in their database, sees that and scoops it up without paying a dime. This pissed off Mocality. And you can't blame them - what cost them $100k in payments to users, and more in website development and fine-tuning, Google is walking away with for free.

Yet if you look at this from any other perspective, the information is online. Too bad, right? If you didn't want people to have access to it, you shouldn't've put it online - right? If we take this to be true - and like it or not, it is true in so far as if you post information of value, someone will most certainly take it for free regardless of your views on things - then what incentive is there to gather this sort of original data? Essentially, what is the business model that allows you to spend the time, effort, and money on gathering data that really deserves to be online, without just looking like an idiot for helping a lower-cost competitor get rolling?

For the time being this situation does appear unjust. But perhaps once the best business model to capitalize on these efforts is clear, the injustice will feel irrelevant. So what might that business model be?

When Google created a search that was substantially better than other searches, one irony was that those search results could easily be stolen. And one (perhaps ironic) response they took to that theft was to add behavior monitoring to their software that watches for what appears to be someone trawling their results to steal them. So one element is to have not just data, but a lot of data - too much for any typical user to ever view all of in one go, so that if someone tries - you can catch them before they get everything as an obvious outlier in usage data. Another is to do a good job of writing that monitoring and blocking software.

One more is to continuously gather that data so that even if someone steals the data in small sips, they're always too far behind you, and too out of date, to be a legitimate competitor.

It appears Gallup's (almost) observation is correct: That you need to both provide original data and analysis to make money. People read or use your product for the promised results you can give them, less than they use it for raw data. In a sense, providing raw data is a much lower value service - one that begs other businesses to come along and analyze, and deliver more promising answers with it. Making a promise to provide specific insights, recommendations or other actionable information can be worth quite a bit of money. Gathering data, if that's all you intend to do, may be a better fit for keeping in a private database you charge for access to, with some kind of downstream "per viewer" fee for what your customers build on top of it. A better economy of ideas in its purest form, but a business model ripe for theft, sadly.

Finally, any business looking to do original data collection is going to make mistakes in their business model, so they're best learning how to make news - after all in Mocality's case, all it took was public embarrassment on places like Slashdot and Reddit (2 news sources Google employees read frequently) to resolve their Google problem. That won't prevent other competitors from swooping in on their data - but it may buy them some time to refine their business model so it doesn't crush Mocality when would-be competitors try.


Monday, October 15, 2012

Writing HTML/CSS on a Mac

HTML/CSS beginners often ask me what editor to use, and I ask them whether they're on a Mac or PC - they're invariably on a Mac and I have zero answers. In the past I've told people to use Smultron, or as my sister called it, "The Tomato?" I have no idea if that was a good answer or not. In fact I'm not even certain the icon was intended to be a tomato - it may have been a raspberry.

In the future I can now abdicate to this Reddit post.

What HTML/CSS editor for Mac should I use...?

If this Reddit post is filled with garbage and lies I look forward to even better answers.

Tuesday, September 4, 2012

HP m6 1045dx Review

The HP m6 1045dx was recently on sale at Best Buy for $699 - that's about $770 with our massive 10% sales tax here in California. I mention the cost because for some reason you can't buy this machine from HP itself for less than $875, and that's after a sale they're running - they normally want to sell it to you for $1000.

I couldn't find any useful reviews of this machine, so, even though I don't really review hardware: Here's a review.

The HP m6 1045dx is part of the HP mX 10X5 series, where they insert a number in place of those X's to mean "better than that other thing with the lower number." There seems to be no other formula to these numbers than that relative comparison, which is annoying if you want to compare to say, their other laptop lines. I looked over the variations in this line and a couple other lines and concluded this particular model is not only one of the easiest to find in stores, it also happens to be one of the strongest out of that selection. Here's the specs:

Aluminum frame
Intel Core i5-3210M 2.5GHz
8GB RAM (expandable to 16GB by complete replacement of RAM)
Intel GPU (Intel HD 4000)
15.6in 1366x768 display
Backlit keyboard (can be switched off)
750GB 5400rpm HD
DVD drive/burner
3 USB3 ports

"Beats Audio" meaning it has a subwoofer in the bottom
Multitouch touchpad with configurable usage
Fingerprint scanner

Wifi, Bluetooth, WiDi, HDMI, VGA, Gigabit (1000Mbit) ethernet, headphone output, mic input

Overall

A laptop this cheap with multitouch, an aluminum frame, and strong performance is a great deal. There are a lot of machines out now that cost more that have fragile plastic frames and no multitouch, and although I'm sure some shoppers fall for those still, they're decidedly behind the times. It performs well, and also gets about 5 hours of battery life, which is better than the 4 you'll get from your average buy. The screen dims and brightens across a good range, and the speakers are surprisingly strong at loud volumes in case you need to watch a quiet Hulu show across from noisy neighbors.

CPU - Performance

The CPU is a strong choice even though HP bills it as their weakest option, offering upgrades online to one of 2 Core i7s. The included i5 outperforms many i7s, and yet uses only 35W of power, unlike those i7s you can upgrade to. That's important in a laptop because it means it won't just wipe out your battery in a couple of hours. It also can tolerate up to 105C, which means the system can run the fan less even on a hot day under heavy usage. 105C isn't high for a CPU although many older CPUs have an upper bound of 80C, meaning a lot more fan usage.
http://www.notebookcheck.net/Intel-Core-i5-3210M-Notebook-Processor.74458.0.html

Intel often points out the CPU has "up to 3.1GHz Turbo Boost," which means that it can ramp a single busy core up to 3.1GHz for as long as the chip has thermal and power headroom, then ease back off as that core gets hot. It's actually a pretty cool feature because most applications you'll run are still meant for single-core CPUs even though you really can't buy a single core CPU anymore. This lets you sort of pretend you've got a single super fast CPU rather than 2 middling ones.

RAM - Memory

The machine comes with 8gb of RAM. It happens to be DDR3, but that doesn't really matter other than to say it's the correct RAM for the CPU it comes with. The RAM is 2 4gb DIMMs, so it occupies both RAM slots in the machine. The machine supports up to 16gb of RAM, so if you were to upgrade you would need to go buy 2 new 8gb RAM DIMMs, and sell or recycle your old 4gb DIMMs. I personally use a laptop with 8gb of RAM for development on Windows 7 and have been able to run with my swap file completely turned off for years, so I'm going to pull a Bill Gates here and say: the included 8gb of RAM is probably all you'll ever need... for the next 5 years.

Display

The display ramps from a dim minimum brightness level that works well in a lights-out setting, to a max brightness level that's good enough in sunlight. If you're looking for the best laptop to use in the park, this probably isn't it. That said, it's no worse than most laptops I've used in this category. The color and resolution are about average compared to other laptops.

Battery

I got 5 hours of battery life in my usage, which is on the standard 6 cell battery that comes with the machine. You can upgrade to a 9 cell that would presumably get you closer to 8 hours. That battery life was using the default "HP Recommended" power profile, which from what I can tell is just the default Windows 7 "Balanced" profile with a pointless rename. In 2 hours of heavy usage the laptop wasn't noticeably hot, although it was warmer on the left side of the keyboard itself than the right.

Presumably you'd get longer battery life like HP's claim of 6 hours if you turned on the Power Saver profile instead, and much worse battery life if you were running something that pegged the CPU, like Virtual PC/Windows XP Mode. One complaint I have here is that HP has modified Windows 7 to always show the "Power Saver" profile in the list of power profiles when you click the battery icon on the right end of the taskbar, and they've limited the number of profiles shown to 2. That means if you add one of your own like say, "Stay Awake" for watching movies, you'll have to click several more times to dig around and select it every time you want to switch to it.

Storage and Boot Time

The clear weak point with this machine is its old school 5400rpm 750GB drive. It's both smaller and slower than nearly any hard drive you can buy today. Boot time is as slow as you'd expect with this kind of drive - about a minute. You should assume drive-bound applications like Photoshop will be likewise slow to start. You can upgrade the drive if these metrics matter to you; we preferred to save the ~$300.

Software

HP limits their installation of garbage on this machine to just a few minor annoyances. There's a nag to get you to register the PC with them that takes a little hunting in msconfig to turn off. "HP SimplePass" cannot be disabled and occupies significant screen real estate in Internet Explorer until you uninstall it completely. Note that uninstalling this scan-finger-to-sign-in-to-websites program is not related to scanning your fingerprint to sign in to the laptop - you can still sign in to Windows without SimplePass installed. Finally, there are a few sort of Adware programs installed by default, including "Blio," whatever that is, "HP Quicklaunch," "HP Launchbox," and Norton Antivirus. All of these were clearly named in the Program Install/Uninstall section of Control Panel and easily removed.

On antivirus, I recommend installing Microsoft Security Essentials, the free anti-virus from Microsoft, rather than enduring the performance burden and subscription nags of Norton Antivirus. C'mon - we both know you've used a PC where Norton or McAfee has just expired for years and no one did anything about it. Anything is more secure than that.

Other Features

The Multitouch touchpad was a welcome add I hadn't seen with Windows machines before; I honestly assumed Apple had patented it. With it you get the obvious like 2-finger scroll, pinch to zoom and rotate. Software support for these functions is somewhat limited, and 2-finger scroll is not quite as intuitive as it might be: To get 2-finger scroll to work, you need to select the window you want to scroll, have the mouse cursor currently over it, and... get a little lucky. In our usage sometimes we just could not get a window to scroll that would scroll fine at other times. Fortunately webpages worked fine most of the time. Random Windows Setting dialogs 2-finger scrolled least effectively, about 1 in every 2 attempts. Clicking to select the window sometimes worked in the latter examples, but sometimes there was absolutely nothing we could do to get it to work. You can turn this feature off, as well as tapping to click (something I despise). You can also do some weird things. You can make 3 finger dragging change desktops. You can tap with 2 fingers twice quickly to turn the touchpad off entirely, to avoid interrupting typing something long like a paper. And the oddest, you can configure tapping once with 4 fingers to open a single application. I can't really imagine a scenario where I want exactly one magical 4 finger shortcut, but hey, it's there if you can.

I was also surprised to find the aluminum frame on this machine. About half the mid-priced laptops available at Best Buy had some variation on an aluminum frame, and I appreciate both how sturdy that is and how much cooler to the touch that tends to be.

The backlit keyboard was a thrill in low light - I'm never buying a laptop (or keyboard attachment for a tablet) without a backlit keyboard after using this one. I also like that you can turn it off with a simple keypress rather than counting on a fiddly light meter that's inevitably going to get it wrong.

HP seems to have borrowed/stolen from Apple in more ways than one, with the keyboard diverging from usual Windows usage. The Function keys - F1-F12 - don't act like function keys by default. Instead, pressing the Volume Down/F3 key acts like a Volume Down key first, and like F3 (Find in Windows) only if you hold down the Fn key. I personally like this shift towards the Mac, but people who are used to F3 doing something different may not. The backlight toggle, display dim/brighten, and mute all work this same way.

The keyboard also has a numpad, which may excite some geeks and those with a background in data entry. The arrow keys on the other hand are half-sized and a bit hard to find on the keyboard. If you're a gamer you may be bothered by these arrow keys.

Then again, if you're a gamer you absolutely should not be buying something with an Intel "HD" GPU built-in; Intel graphics are garbage by today's 3D standards. That said, Google Earth renders just fine on this limited GPU, and of course it does use a lot less power than a mobile 3D card would. The only major drawback in simple 3D applications like Google Earth is the lack of anti-aliasing due to a weird and confused stance the CEO has taken against it. Anti-aliasing prevents jagged lines when looking at a diagonal or curved edge of say a mountain or skyline. You'll be getting those jagged lines in 3D apps on this laptop. If you care at all.

I was pleased to see USB3 finally become a standard offering on midrange laptops. I'm also glad someone is making a good loud audio option available on midrange laptops.

Tuesday, July 10, 2012

jQuery.each()

jQuery.each() has many forms and functions, and the documentation is lacking. Here's what the documentation has to say about it:

http://api.jquery.com/each/

.each( function(index, Element) )

function(index, Element)   A function to execute for each matched element.

As it turns out this is a significant oversimplification of this function. It varies in 3 ways:
  1. Whether it's called on the global jQuery object like $.each(collection, fn), or on a jQuery instance like $('div').each(fn).
  2. When used in the global form, whether it's called on an array or an object.
  3. Whether it's called with any additional arguments.
Before I introduce any examples let's use a simple jsfiddle for testing these variations on .each():

The code is pretty simple. There's a Run button that can call a simple say(s) function that appends whatever HTML is passed in to the div id=output below the button. Now we can inspect the code, click Run, and see the results. The code I'll be demonstrating will always appear under the // Demo code comment.

$.each(array, fn)

Let's begin with the most common usage - calling the global version on an array:

var array = ['hi', 'bye', 'hello?'];
$.each(array, function(i, val) {
    say(i + ': ' + val);
});

This is called on the global jQuery object rather than an instance. The arguments passed to the function are the index in the array, and then the value at that index. One important subtlety here is that this has been coerced into also being the value at that index of the array. In other words this code can be simplified to:

var array = ['hi', 'bye', 'hello?'];
$.each(array, function(i) {
    say(i + ': ' + this);
});

A minor variation, but the reduced number of params may minify slightly smaller. If you were only using the values in the array and not the indices, you could pass a function with no params whatsoever:

var array = ['hi', 'bye', 'hello?'];
$.each(array, function() {
    say(this);
});

$('tag').each(fn)

Calling .each() on a jQuery instance object is identical to calling it on an array, except you pass the function as the first argument this time, and the subtlety of what this refers to has changed - it now refers to the tag in the list of tags jQuery has selected. Not a jQuery object wrapping that tag, but the actual browser-specific Element object. The value object passed earlier is now that same tag object.

$('span').each(function(i, tag) {
    say(i + ': ' + tag.innerHTML);
});

This performs identically:
$('span').each(function(i) {
    say(i + ': ' + this.innerHTML);
});

$.each(obj, fn)

Looping through an object varies significantly from the array format. The function you pass in is now of the format:

function(p, v) {
}

Where p is the property name (a string), and v is the value of that property in the object. For example:

var obj = {
    a: 'hi',
    b: 'bye',
    c: 'hello?'
};
$.each(obj, function(p, v) {
    say(p + ': ' + v);
});

Here we build a simple object to iterate through, and print out its properties by name and value. Notice that there's no way to get an index from this function anymore. If you do want to keep track of how many items have been counted, you'll need to tack that on yourself via a closure:

var obj = {
    a: 'hi',
    b: 'bye',
    c: 'hello?'
};
var i = 0;
$.each(obj, function(p, v) {
    say((i++) + ',' + p + ': ' + v);
});

And as you might expect, this is the value of the property currently selected, meaning the following code produces the same output as the first object example above:

var obj = {
    a: 'hi',
    b: 'bye',
    c: 'hello?'
};
$.each(obj, function(p) {
    say(p + ': ' + this);
});

Caveats of using this

jQuery's reuse of the this object in Javascript comes with some built-in dangers due to the Javascript language itself. Suppose you had an array of boolean values like:

var array = [true, false, true];

And decided to loop over it checking whether they were true or false. What would you expect?

If you notice, the values printed out this time are identical - they all come out to false! Why? Because in Javascript converting the special variable this to boolean always converts it to true, even if the underlying object is itself boolean and set to false!
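
The demo that sat above that paragraph isn't reproduced here, so this is only a minimal sketch of the effect, not the original test. To run without jQuery or a browser it uses a hypothetical stand-in called each, which mirrors just the array branch of the $.each() source quoted at the end of this post. The explicit Object(...) call is my assumption for imitating how non-strict JavaScript boxes a primitive this. The callback records !this and !val: since this is always a truthy wrapper object, !this comes out false for every item.

```javascript
// Hypothetical stand-in for the array branch of $.each() (not jQuery itself).
// Object(value) imitates how non-strict code boxes a primitive `this`.
function each(array, callback) {
    for (var i = 0; i < array.length; i++) {
        if (callback.call(Object(array[i]), i, array[i]) === false) {
            break;
        }
    }
    return array;
}

var viaThis = [];
var viaParam = [];

each([true, false, true], function (i, val) {
    // `this` is a Boolean wrapper object, and every object is truthy.
    viaThis.push(!this);
    // `val` is still the primitive boolean, so it negates as expected.
    viaParam.push(!val);
});

console.log(viaThis.join(', '));  // false, false, false
console.log(viaParam.join(', ')); // false, true, false
```

Reading the value from the second parameter sidesteps the wrapper object entirely.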

This is not an easy rule to remember, and jQuery's coercion of the value of this in $.each() encourages making this mistake. Given that the code savings of using this are limited, in any environment with multiple coders or even where you simply worry about making mistakes, it might be smart to forbid the use of the this keyword inside $.each() calls and always use the parameters passed to the function. This prevents both this misunderstanding, and people expecting some other object in the place of this (like the array being looped through, or what this refers to outside the closure created by this function).

Related Functions

You occasionally want to get a specific tag from a list of tags selected by jQuery rather than looping through the list. If you want that tag as its browser-specific Element object, you can access it like:

$('span')[2]

Which returns the 3rd tag in the list jQuery has selected.

If you'd rather have the jQuery-wrapped tag, you use the strangely-named .eq method instead:

$('span').eq(2)

Alternative to Plain Loop

It's common to use $.each() in place of a standard for loop:

for (var i = 0; i < array.length; i++) {
    say(i + ': ' + array[i]);
}

$.each(array, function(i, val) {
    say(i + ': ' + val);
});

The code savings here are most significant when you use the this form and aren't using the index in the loop:

for (var i = 0; i < array.length; i++) {
    say(array[i]);
}

$.each(array, function() {
    say(this);
});

The problem with using $.each() in place of a regular for loop is that you can no longer use break to break out of the loop early. For this you must use an alternative syntax - return false:

for (var i = 0; i < array.length; i++) {
    say(array[i]);
    if (array[i] == 'bye')
        break;
}

$.each(array, function() {
    say(this);
    if (this == 'bye')
        return false;
});

But, there's still one thing a regular for loop can do that $.each() can't, and that's return from the entire function early - for example if you're searching an array for a specific value:

function indexOf(array, val) {
    for (var i = array.length - 1; i >= 0; i--) {
        if (array[i] == val)
            return i;
    }
    return i;
}

Unfortunately there isn't really an equivalent to this in $.each() - at best you can inject a flag into the closure and check it to decide whether to return early:
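
The original demo isn't reproduced here, so here is a sketch of the idea under a few assumptions. indexOfEach is a hypothetical name, and each is a stand-in that mirrors only the array branch of the $.each() source quoted at the end of this post, so the snippet runs without jQuery. The callback stores its result in a variable belonging to the enclosing function and returns false to stop the loop; the enclosing function then does the real return. Unlike the for loop above, this walks forward, so it reports the first match rather than the last.

```javascript
// Hypothetical stand-in for the array branch of $.each() (not jQuery itself).
function each(array, callback) {
    for (var i = 0; i < array.length; i++) {
        if (callback.call(Object(array[i]), i, array[i]) === false) {
            break;
        }
    }
    return array;
}

function indexOfEach(array, val) {
    var found = -1; // the "flag", shared with the callback via the closure
    each(array, function (i, item) {
        if (item == val) {
            found = i;
            return false; // stops each(), but only each()
        }
    });
    return found; // the actual return has to happen out here
}

console.log(indexOfEach(['hi', 'bye', 'hello?'], 'bye'));  // 1
console.log(indexOfEach(['hi', 'bye', 'hello?'], 'nope')); // -1
```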


Here's the same example where the loop is allowed to complete:
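
That demo is also missing, so this is my guess at its shape rather than the original. lastIndexOfEach is a hypothetical name and each is a stand-in for the array branch of the $.each() source quoted at the end of this post. The callback never returns false, so every item is visited and a later match overwrites an earlier one.

```javascript
// Hypothetical stand-in for the array branch of $.each() (not jQuery itself).
function each(array, callback) {
    for (var i = 0; i < array.length; i++) {
        if (callback.call(Object(array[i]), i, array[i]) === false) {
            break;
        }
    }
    return array;
}

function lastIndexOfEach(array, val) {
    var found = -1;
    var visited = 0; // only here to show the loop ran to the end
    each(array, function (i, item) {
        visited++;
        if (item == val) {
            found = i; // no `return false`, so later matches overwrite this
        }
    });
    return { index: found, visited: visited };
}

var result = lastIndexOfEach(['bye', 'hi', 'bye', 'hello?'], 'bye');
console.log(result.index);   // 2
console.log(result.visited); // 4
```

Letting the loop finish costs a full pass over the array, but it happens to give the same answer as the backwards for loop above: the last matching index.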


$.each(array, fn, args)

jQuery maintains a number of APIs it uses internally but asks that you don't use externally. Still, if one saves you some code, there's no harm in leveraging it, so long as you write a unit test that validates that internal API still works, and check it whenever you update jQuery.

With that warning, the internal version of $.each() is used to call a function with a fixed set of arguments on all the items in an array. Here's a silly example:

var array = ['hi', 'bye', 'hello?'];
$.each(array, sayItem, ['monkeys']);
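
sayItem isn't defined anywhere in this post, so here is one hypothetical version, purely to show what the callback receives. each is a stand-in that mirrors only the args branch for arrays in the jQuery 1.7.2 source quoted at the end of this post; it is not jQuery, and lines replaces the say() helper so the snippet runs on its own. In this form the callback gets no index or value parameters at all: this is the current item and the parameters are whatever was in the args array.

```javascript
// Hypothetical stand-in for the args branch of $.each() on arrays.
function each(array, callback, args) {
    for (var i = 0; i < array.length; i++) {
        if (callback.apply(Object(array[i]), args) === false) {
            break;
        }
    }
    return array;
}

var lines = [];

// Hypothetical callback: `this` is the item, `what` comes from args.
function sayItem(what) {
    lines.push(this + ' ' + what);
}

each(['hi', 'bye', 'hello?'], sayItem, ['monkeys']);

console.log(lines.join('\n'));
// hi monkeys
// bye monkeys
// hello? monkeys
```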

You can probably infer how jQuery makes use of this - any time it needs to loop through a series of tags or objects and set the same property on all of them, it can use this shorthand. You can too:

function setBgColor(color) {
    this.style.backgroundColor = color;
}
$('p').each(setBgColor, ['#f00']);

This can also be useful if your function is externally defined, and you want to pass it some external context information (rather than creating a closure). Consider these 2 examples of building an object from an array:
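
The two original demos aren't reproduced here, so the following is a sketch of what such a pair might look like, not the original code. each is a hypothetical stand-in covering only the array branches of the jQuery 1.7.2 source quoted at the end of this post, and the body of enableFlag is my guess; only its name comes from this post.

```javascript
// Hypothetical stand-in for the array branches of $.each() in jQuery 1.7.2.
function each(array, callback, args) {
    for (var i = 0; i < array.length; i++) {
        var item = Object(array[i]); // imitates non-strict boxing of `this`
        var result = args ? callback.apply(item, args)
                          : callback.call(item, i, array[i]);
        if (result === false) {
            break;
        }
    }
    return array;
}

var names = ['bold', 'italic'];

// 1. Closure form: the callback reaches out to `flagsA` directly.
var flagsA = {};
each(names, function (i, name) {
    flagsA[name] = true;
});

// 2. args form: enableFlag is defined elsewhere and knows nothing about
//    our variables; the target object arrives as an argument.
function enableFlag(flags) {
    flags[this] = true; // `this` is the current item, used as the key
}
var flagsB = {};
each(names, enableFlag, [flagsB]);

console.log(flagsA); // { bold: true, italic: true }
console.log(flagsB); // { bold: true, italic: true }
```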


In the latter example, enableFlag is defined outside our code, which would have required creating an extra closure around it to pass in the hashset we want to toggle these flags on. We can avoid a little code bloat with this usage. You can leverage this a bit further:


The Code

You can see the above cases pretty easily in the main code for $.each() (this is taken from jQuery 1.7.2).

// args is for internal usage only
each: function( object, callback, args ) {
    var name, i = 0,
        length = object.length,
        isObj = length === undefined || jQuery.isFunction( object );

    if ( args ) {
        if ( isObj ) {
            for ( name in object ) {
                if ( callback.apply( object[ name ], args ) === false ) {
                    break;
                }
            }
        } else {
            for ( ; i < length; ) {
                if ( callback.apply( object[ i++ ], args ) === false ) {
                    break;
                }
            }
        }

    // A special, fast, case for the most common use of each
    } else {
        if ( isObj ) {
            for ( name in object ) {
                if ( callback.call( object[ name ], name, object[ name ] ) === false ) {
                    break;
                }
            }
        } else {
            for ( ; i < length; ) {
                if ( callback.call( object[ i ], i, object[ i++ ] ) === false ) {
                    break;
                }
            }
        }
    }

    return object;
},