The Checkmate Scientist: July 2014

Tuesday, July 8, 2014

Five Tips on Doing Business in Silicon Valley. Actually, Five Tips on Doing Business Anywhere.

The folks at MaRS just released this little video highlighting five tips for doing business in Silicon Valley. The advice is applicable anywhere.

1. To succeed, first understand the area’s history.

Whenever you're working with people outside of your area, whether that means your geography or your field of expertise, you need to be able to relate to where they came from. How do their values differ from yours? What is important to them? Is there something about that location or field that attracts a certain type of person, or encourages a particular kind of behaviour (think entrepreneurship, research excellence, etc.)?

2. Spend your time there legally and intelligently.

Plan ahead to get the biggest return on your time investment. What

3. Be open to collaboration.

Share ideas with your potential partners. Help them develop their ideas and they will help do the same for yours.

I've written before about how operating in stealth mode stifles research projects and exposes scientists to several traps. Collaboration takes effort, but it can pay off in spades when you find good partners, especially in high-risk, pre-commercial (i.e. basic) research.

4. Steer clear of the myths about Silicon Valley.

Not sure I fully agree with this one.

Myths exist about every place and every institution. However, there are bad myths and good myths.

Bad ones will usually serve to drive you to inaction. They're the ones about cutthroat competition, backstabbing, politics, and favoritism.

Good myths, on the contrary, will encourage you to make connections and build on your ideas. The good myths may turn out to be false, but at least they've led you to break that inertia of doing nothing.

5. Recognize that San Francisco is not Silicon Valley.

Aron Solomon's point is that the two may be a 45-minute car ride apart, but they are not the same kind of place. The same is true of the many organizations that may exist in a technology cluster, even if they're within a 45-minute walk of each other.

Universities are different from research institutes, and independent research institutes are different from those associated with hospitals.

A Big Cash Prize is a Great Motivator

A business plan competition for 'young' (under 36) scientists, run by Oxford Biotech Roundtable and GSK, figures out how to motivate scientists to come out to bat:

Our fundamental challenge was to generate enthusiasm for a biotech business plan competition and get people excited about entrepreneurship in a sector and region not known for its risk-taking culture. But we also knew that the caliber of researchers and students we sought to engage would need an attractive value proposition to incentivize them to invest their time and energy. In this respect, the grand prize (£100,000 or about $180,000) provided an attractive reason for entrants to engage with the competition rather than pursue more established career trajectories.

The rest of the article includes many other bits of useful knowledge, like the main obstacles young researchers face when considering entrepreneurship (think networks and poor mentors), but the importance of setting the value of a prize, grant, or fellowship is clear: if you want quality applicants, the chance of winning the prize must be worth their while.

Thursday, July 3, 2014

More Data Doesn't Mean More Interesting Data

David Beer, at Adaptive Computing, writes:

One of the keys to winning at Big Data will be ignoring the noise. As the amount of data increases exponentially, the amount of interesting data doesn’t.

He describes the problem of predicting what online video a user is going to watch next, and how an analysis can quickly run the number of predictions up into thousands of possible 'next steps' to evaluate.

These are then compared with all of the other empirical data from all other customers to determine the likelihood that you might also want to watch the sequel, other work by the director, other work from the stars in the movie, things from the same genre, etc. As I perform these calculations, how much data should be ignored? How many people aren’t using the multiple user profiles and therefore don’t represent what one person’s interests might be? How many data points aren’t related to other data points and therefore shouldn’t be evaluated as a valid permutation the same as another point?

These points are probably the biggest value that an experienced scientist can bring to data problems of this scale. This kind of person has at least several years of work experience in a hypothesis-driven research environment and is able to solve problems using incomplete data. They probably have a PhD to go with that quantitative experience.

The first point, working in a hypothesis-driven environment, suggests that the person should be able to devise a strategy to prove or disprove a hypothesis (I hypothesize that this customer will watch video Y after video X), and figure out how to do that efficiently without getting stuck in the weeds, or in the irrelevant data Beer describes. Unfortunately, it takes some skill as an interviewer to determine whether a candidate can actually do this, especially if there are differences between yourself and the interviewee.
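As a rough illustration of testing the 'video Y after video X' hypothesis while ignoring noise, here is a minimal Python sketch. Everything in it is an assumption made for the example: the function name `next_video_probabilities`, the `min_support` cutoff, and the toy viewing sessions are hypothetical, and simple transition counting is just one plain way to approach it, not the method Beer describes.

```python
from collections import Counter, defaultdict


def next_video_probabilities(sessions, min_support=2):
    """Estimate P(next video | current video) from ordered viewing sessions.

    `sessions` is a list of lists of video IDs, each in viewing order.
    Videos with fewer than `min_support` observed follow-ups are skipped,
    on the assumption that such sparse data is noise rather than signal.
    """
    transitions = defaultdict(Counter)
    for session in sessions:
        # Pair each video with the one watched immediately after it.
        for current, following in zip(session, session[1:]):
            transitions[current][following] += 1

    probabilities = {}
    for current, counts in transitions.items():
        total = sum(counts.values())
        if total < min_support:
            continue  # too little evidence; ignore instead of guessing
        probabilities[current] = {
            video: n / total for video, n in counts.items()
        }
    return probabilities


# Hypothetical toy data: four short viewing sessions.
sessions = [
    ["X", "Y", "Z"],
    ["X", "Y"],
    ["X", "W"],
    ["Q", "X"],
]

probs = next_video_probabilities(sessions, min_support=2)
print(probs["X"])
```

With this toy data, only video X has enough follow-ups to be evaluated. The single observations for Y and Q are dropped, which is the point: the analysis ends up smaller than the data.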

The second point, being able to use incomplete data, is something that seems to come from experience. Most people trained in research fields start off trying to collect the most data possible, and don't make a decision until 'more data is collected'. It's easy to get stuck in a data collection rut, but eventually most people realize that it's actually OK to come to a conclusion before seeing the whole picture.

Collecting a lot of extra data costs time and resources, and puts a demand on your attention span until that elusive point of having 'enough data' is reached. Sometimes that data is worth it, but many times it's not. It just sits there because no one has time to do anything with it, so the data remains idle and risks becoming stale. Unless it's actually your job to do so, be careful of making data for the sake of making data.

ASIDE: One of the neatest things I find about the customer analytics field (as compared with genomics or computational biology) is that data is basically being generated by the study population itself, for what is essentially free.

Tuesday, July 1, 2014

Snowflakes Visualize Wind Turbine Effects on Airflow

Oh yeah, by the way, 'Here we use snowflakes from a winter snowstorm as flow tracers to obtain velocity fields downwind of a 2.5-MW wind turbine'

say the authors of a really neat article at Nature Communications.

Checkmate Scientist is Closing Comments

Regular readers (there are about 200) may be sad to find out that I'll be closing comments going forward.

I've become much busier over the last six months (as you may have noticed from the decrease in posts) and I've unfortunately been moderating an increasing number of comments that are clearly from spammers. I'd rather spend time reading and writing than deleting spam, and I think you'll agree.

As always, you can send in comments to comments@checkmatescientist.net or via Twitter to @pmkrzyzanowski and I'll do my best to respond.