Tuesday, August 28, 2012

Twitter Joins the Linux Foundation

Twitter uses and builds a fair amount of open-source software, so it wasn’t too shocking when we read in our inboxes this morning that the social media startup has joined the Linux Foundation.

“Not only is Twitter built on Linux, but open source software is core to its technology strategy,” said a Linux Foundation rep to VentureBeat via email.
“It’s investing even more in the platform now as the company evolves and positions itself for the future. Linux has become even more dominant among web-based companies as the ‘hacker way’ has become pervasive among the newest generation of startups.”
Ah, yes, the Hacker Way. Or should we say, the “Hacker Way.”
Espousing open-source ideals, at least in spirit, has become increasingly common among web startups, especially in the Bay Area. Facebook CEO Mark Zuckerberg famously wrote a “hacker way” mini-treatise into his company’s SEC IPO filing.
But in fact, while companies like Facebook and Twitter rely on open-source technologies and programming languages to get their various jobs done, their businesses are conceptually based on proprietary software, not open-source software.
As famous hacker Eric “esr” Raymond pointed out in a recent interview with VentureBeat, the true hacker way means “to give control to the individual, to respect his or her privacy, to create tools for autonomy and liberty, and to encourage creative re-use of software” — only parts of which are built into Twitter’s products.
While Twitter has open-sourced some of its software (a load generator called Iago and a design framework called Bootstrap, for example), vast expanses of Twitter code remain under lock and key, and the company’s recent and coming API changes mean it’s getting further away from anyone’s definition of free and open-source software.
This is a problem. Specifically, it’s a recruitment problem.
Twitter needs to continue to pull in the best, brightest, neckbeardiest developers the world can offer, and it can’t do so without some commitment to open-source communities. The company actually recently hosted an open-source event with thinly veiled recruitment mechanisms built in just for this reason: great developers and open-source software go together like peanut butter and jelly, and the more you can convince a great developer that your company believes in open-source, the more likely you are to recruit great developers in a highly competitive hiring environment.
All that being said, Twitter does have a vested interest in helping to advance the cause of Linux in particular, and some participation in open-source communities is better than none at all.
As a web-based business, Twitter, like every other web service, is supported by tens of thousands of Linux servers. In a statement on today’s news, the company said it intends to partner with the Linux Foundation to promote and protect Linux, the open-source operating system.
“Linux and its capability to be heavily tweaked is fundamental to our technology infrastructure,” said Twitter open-source manager Chris Aniszczyk in the statement.
“By joining the Linux Foundation, we can support an organization that is important to us and collaborate with a community that is advancing Linux as fast as we are improving Twitter.”

Friday, August 24, 2012

Adapteva releases early-access OpenCL SDK

Adapteva, a privately held semiconductor technology startup, today announced that it is providing an early access release of an OpenCL SDK for the Epiphany multicore architecture. The OpenCL implementation was completed together with Brown Deer Technology, a leading innovator in open-source heterogeneous computing.

OpenCL™ is an open, royalty-free standard for cross-platform, parallel programming that is now reaching widespread adoption in servers and handheld/embedded devices. OpenCL (Open Computing Language) provides a portable API for accessing the compute capabilities of a platform, accelerating performance in a wide spectrum of applications in numerous market categories from gaming and entertainment to scientific and medical software. With the OpenCL SDK, Epiphany programmers will be able to easily accelerate compute-intensive tasks across an arbitrary number of cores on Epiphany-based accelerator solutions.

“Adapteva’s Epiphany multicore architecture scales to 1000’s of cores on a single chip. Such massive parallelism requires a battle tested programming model that scales well. We chose OpenCL because it is an open standard with broad industry support and because it fits perfectly with our approach to heterogeneous computing”, said Andreas Olofsson, CEO at Adapteva. “We were very fortunate to be able to leverage the COPRTHR OpenCL implementation developed by Brown Deer Technology for ARM and x86 processors. The speed by which BDT ported its COPRTHR implementation to the Epiphany architecture and the quality of the results were simply outstanding and speaks volumes about the level of innovation and expertise at BDT and the maturity and flexibility of the Epiphany architecture.”

“Adapteva has delivered an architecture that supports massive on-chip parallelism with impressive power efficiency. OpenCL provides the perfect foundation for such a processor," said Dr. David Richie, President at Brown Deer Technology. "We leveraged this API in an SDK that provides programmers with a clear path for code development. Programmers will find their parallel algorithms mapping naturally to the Epiphany architecture. The chip is designed for massively parallel algorithms. Extracting performance from the architecture becomes relatively easy as a result of this design.”

Target Markets and Availability

The Epiphany OpenCL SDK is currently in Beta release and available to early access partners.
Coding examples and white papers will be published on Adapteva’s corporate site in the coming month.

Thursday, August 23, 2012

Typing.io Lets You Practise Typing In Programming Environments

Programming languages are difficult to wrap your head around at first; you need to train your muscle memory to insert different characters after typing lines. Typing.io is a tool to help budding programmers practise and become more efficient at coding.

Typing.io is not meant as a tool to teach you programming. It’s meant solely as a way to practise typing code from open-source projects in different languages and frameworks. These include JavaScript, Ruby on Rails, Perl and C, among others. When you load one of the lessons, you simply need to type over the text on the screen. When you make an error, the text will turn red. If you’re new to programming and need help practising your skills, Typing.io is a great place to start.
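The type-over check described above can be pictured with a small sketch. This is only an illustration of the idea, not Typing.io's actual code: the function name `mark_errors` and the sample lesson text are made up for the example, and the real site may handle things like indentation or backspacing differently.

```python
from typing import List


def mark_errors(target: str, typed: str) -> List[bool]:
    """Return one flag per typed character: True where it differs from the lesson text.

    Hypothetical helper for illustration; a typing tutor would colour
    the flagged positions red.
    """
    flags = []
    for i, ch in enumerate(typed):
        # Anything typed past the end of the lesson text counts as an error.
        expected = target[i] if i < len(target) else None
        flags.append(ch != expected)
    return flags


# The learner hits "{" where the lesson text has "(".
lesson = "for (i = 0;"
attempt = "for {i"
print(mark_errors(lesson, attempt))
```

Only the fifth character is flagged here, so a tutor built this way would highlight just the mistyped brace and let the learner carry on.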

Tuesday, August 21, 2012

Hadoop gets real-time processing from open source vets

Nodeable solves real time Big Data issues

Big Data is on a lot of people's lips these days. There is no doubt that we are generating lots of data. Analyzing that data and making it useful is fueling millions of dollars of investment in companies around Hadoop, NoSQL, etc. One area where Big Data has some challenges is real-time analysis. With all of that data, analyzing in real time to get actionable intelligence into the hands of users is hard. That is the challenge that Nodeable is seeking to tackle.

Nodeable is led by a couple of open source veterans. Dave Rosenberg, formerly of MuleSource among a few other open source projects, is the CEO of Nodeable. With him are several folks who have worked with him in his previous open source companies. Additionally, Matt Asay, another veteran open source company builder, is on board at Nodeable as well.

I had a chance to sit down with Dave and talk about what he and his team are doing with Nodeable. You can listen in on our roughly 15-minute conversation below. Let me warn you, the audio is a bit uneven at some points, but it isn't too bad and I think the conversation is well worth putting up with the audio quality.

The Nodeable team is using an open source program called Storm, which was originally developed by some folks at Twitter. Nodeable is seeking to commercialize this and build on top of it. This is a model that Dave has followed in the past and has lots of experience with.

Nodeable has been kicking around for a while now, but only recently went public with this model. It is not competitive with Hadoop or other Big Data solutions; rather, it adds another needed facet to Big Data analytics.

So have a listen to Dave and check out a new and different Big Data solution coming to market.

Razuna releases version 1.5 of its open source DAM software

The newest version of the open source digital asset management (DAM) platform Razuna is now available.

Razuna ApS announced late last week the availability of version 1.5 of the platform, which can be downloaded, used via a hosted service or run on a dedicated cloud server.

Customization, Rendering Farm
Enhancements in the new release include an option to fully customize Razuna, an option to log in using social media username and password via the integrated Janrain plugin, a new Rendering Farm functionality, and major updates to search and the overall look and feel.

The Rendering Farm distributes the job of encoding many files to other servers, whether a dedicated one in-house or one in the cloud. Customization options include the ability to modify tabs, dialogs or look and feel, and the company said that the new caching system “dramatically improves” overall performance and supports such caching engines as Memcached and MongoDB.

Razuna now supports scheduled backups of assets and data within the platform, as well as the ability to export metadata to a spreadsheet.

There’s also additional support for cloud storage, such as Amazon S3, Nirvanix or Eucalyptus, and there’s a new version of the application programming interface (API), which facilitates integration into an organizational environment.

Partner Program Overhaul
CTO and Razuna founder Nitai Aventagiato told news media that, with the latest additions, Razuna is “truly an enterprise level digital asset management software” that is available to companies of any size via the hosted offering.

Later this month, a new partner and OEM program will be launched, which the company said was due to an increasing demand for OEM solutions. CEO Jens Strandbygaard said in a statement that the partner program is undergoing a complete overhaul, adding that the API has allowed software vendors “to embed Razuna deep into their existing technologies” to leverage enterprise-level DAM features.

Strandbygaard also said that resellers and systems integrators will benefit from the new program, “since they will be awarded a higher commission as well as being able to offer our enterprise package to large scale clients.”

The Denmark-based Razuna, founded in 2005, said that its platform is used by more than 5000 businesses worldwide every day.

Saturday, August 18, 2012

GitHub provides companies with tech talent

Because engineers and designers can post their work for all to see, more and more companies are realizing they can see what people can actually do, not just say they can do.

LinkedIn is so 2011.

In the red-hot market for skilled software engineers, companies looking to make great hires are discovering that relying on traditional services that showcase candidates' work histories -- but not their actual work -- is a great way to miss out on the best available talent.

These days, there's a new game in town -- GitHub, a place where hiring managers and recruiters alike are increasingly turning to find not just the potential employees who look best on paper, but the ones that actively (and publicly) demonstrate their capabilities.

Last month, Andreessen Horowitz, one of the hottest venture capital firms in Silicon Valley, put $100 million -- its largest-ever investment -- into GitHub, a company built to facilitate the organization of open-source projects, and that makes money by selling licenses to commercial and corporate users.

Asked why his VC firm ponied up the nine figures, partner Ben Horowitz cited GitHub's dominance these days in being a central repository for open-source code. But he also touted the company's growing role as a place to find top-tier tech talent -- and, more to the point, a preferable alternative to LinkedIn.
"I was talking to my friend [who] runs a tech screening process for looking at engineers," Horowitz told CNET. "I said, 'What do you use for recruiting?' He said GitHub. I said why not LinkedIn? He said, 'why would I look at their resume when I can look at a body of work?' And since he said that to me, I ask everybody [what they use] for engineer recruiting, and everybody uses GitHub. That's a big deal. It means if you're an engineer and you don't use GitHub, you don't exist."

Every engineer and all the code

In Horowitz's view, GitHub has become a place where the hottest engineers are coming together to share their code, and as a result, the service is home to the most important project and collaboration tools, as well as application life cycle management systems in the business. "They've got the ultimate advantage," Horowitz said, "because they have every engineer and all the code."

This assessment is shared widely throughout the tech industry. From small startups to established, household name powerhouses, GitHub is now seen as the go-to place to spot quality talent. To be sure, there are still engineers who will get Silicon Valley jobs without putting up a GitHub profile, and for whom LinkedIn is still an employment lifeline, and every company's mileage may vary, but a common view is that a developer who has a profile there has an advantage over those who don't.

At many companies, the feeling is that engineers who take the time to develop a GitHub profile and put in the energy to participate actively in the community can be better evaluated in advance than others.
"It's an excellent opportunity to see what they are passionate about, their coding style -- good or bad -- and fun side projects," said Will Young, director of Zappos Labs. "We love when developers see a need and just go ahead and code a solution to share with the community. We are looking for some amazing problem solvers on our team. This is hard to get from an interview or resume. But sometimes, we see someone's GitHub library and think, 'Wow, that is really cool and handy.'"

GitHub itself has been looking to its own service's community for talent, sometimes hiring people that may not present the most stellar picture on paper, but who show off stellar programming skills in real life. "Previously, where you went to college was the be all and end all," said Zach Holman, who evangelizes for GitHub. "The fact that that's not true anymore is fascinating."

Holman also said that internally, GitHub is seeing more and more signs that outside companies are using the service as an initial indicator of whether a potential hire is good or not. "Whether or not somebody has contributed to open source is a good indicator of whether they're a good engineer," he said.

Distinguish themselves

What's particularly attractive to the people who work at GitHub, Holman added, is that the service has become such a great way for developers to distinguish themselves, even as it got its start more as a place where people were sharing their work for no reason other than to do so.
But that sense of selflessly participating in open-source projects is something that is increasingly attractive to hiring companies. "I've heard some of our portfolio companies mention the number of [GitHub] contributions people make and how active they are, and connecting that with their credibility in the community," said Craig Driscoll, recruiting partner at Highland Capital Partners. "There's just the signaling that someone using those types of communities is a general type of qualifier...[especially the] frequency and quality of the contributions."

Indeed, some tech companies are turning to GitHub to identify potential new hires who aren't even actively looking for a new job, and who may not have a resume online. Of course, almost anyone getting hired is still going to have their resume checked out and their education scrutinized, and recruiters are still combing LinkedIn's millions of active users, but a candidate's GitHub presence may be the single most important factor. "We're always looking for people who have forked a lot of [open-source] projects and contributed back into those projects," said Tim Milliron, director of engineering at Twilio, a developer of cloud-based communications apps. "We like people contributing into open source...That carries a lot of [weight with] us."
Milliron said that Twilio has been looking at GitHub as a recruiting platform for more than a year, but that the pace of doing so has accelerated significantly in the last six months. "If we look at 20 people and five have GitHub profiles," he said, "and one has [contributed a lot], then that person tends to bounce to the top of the list."

To be sure, GitHub is hardly the only open-source community that is being looked at by companies searching for technical talent. But in talking to people throughout the technology industry, it appears that GitHub is getting the lion's share of the attention. As Barney Pell, CEO of QuickPay, and the founder of Powerset, whose technology became the basis of Microsoft's Bing, put it, "Online open-source communities like GitHub bring large numbers of...developers together and are thus a natural place for recruiting."

Red Hat's OpenStack distribution raises concerns

There is a growing group of OpenStack distribution providers and some industry watchers who believe fragmentation of the cloud standard could negatively impact cloud software providers and customers.

Red Hat Inc. launched its own distribution of OpenStack this week. Rackspace launched public cloud services based on the open source cloud software stack earlier this month, and other companies with OpenStack offerings on the market include Hewlett-Packard Co., Canonical's Ubuntu, Dell Inc., Piston Cloud Computing and Nebula.

OpenStack isn't just a package like other open source tools, said Lydia Leong, research vice president at Gartner Inc., based in Stamford, Conn. It's an entire framework, which means it will be more difficult to keep the different components in lockstep with one another. Also, any given OpenStack product could theoretically contain whatever combination of those components its creator feels like including, along with proprietary extensions.

Red Hat said any changes it makes to the OpenStack standard will be fed back into the community, but according to Leong, "If your distro doesn't have a lot of changes, then it doesn't have a whole lot of differentiation. But if it does have changes, then you have fragmentation."

Fragmentation makes supporting OpenStack difficult for users and for the management tools ecosystem, Leong said.

For other experts, the growing number of distributions is not a sign of fragmentation, but of growing maturity in the OpenStack market -- especially now that there is a Red Hat version.

"Generally speaking, all these multivendor foundations need one or more distros because there are a class of customers who won't take the product without support," said Mike Norman, analyst with The Virtualization Practice LLC, a virtualization and cloud consultancy based in Wrentham, Mass. "It's a sign of maturity that Red Hat is picking up OpenStack, producing a distro and committing to proper enterprise support and product lifetime."

"The important part is that everyone's sticking to the same [application programming interfaces]," said Chris Perry, cloud architect for DreamHost, a hosting provider based in Brea, Calif. "Even if people implement it in a slightly different way on the back end, the APIs are the same pretty much across the board."

Norman is also encouraged by the changes he's seen in the governance of OpenStack over the last year. Previously, OpenStack was too much under Rackspace's thumb for Norman's liking, but with the establishment of the OpenStack Foundation in April, it has become more of an independent organization, he said.

The Foundation will hold its first election next week as part of "the final critical steps toward giving [OpenStack] a final independent home," said Jonathan Bryce, president of the OpenStack Project Policy Board.

The next OpenStack release, dubbed Folsom, is due in October.

Right now the open source community is on OpenStack's side, but it's important for the Foundation's participants to start seeing revenue soon, said James Staten, analyst for Forrester Research.

"If OpenStack fails by the fall of this year or the first quarter next year to really start driving revenue, there are a lot of companies that are participating in the OpenStack community in an oftentimes passive way who really need revenue from this, [who] could jump ship," he said. "They have a ticking clock [and] they've really got to move fast."

Thursday, August 16, 2012

New open source Calligra Suite release enhances ODF document support

Calligra is a fairly young open source office suite that shows promise, an expert said

Calligra has published the second stable release of its open source suite that includes word processing, spreadsheets and a sketching program. The new version greatly improves the support of Open Document Format (ODF) documents, said one of its main developers on Tuesday.

The Calligra Suite is an application suite for Linux that includes programs not found in traditional office suites, so the development team prefers to call it an "integrated work applications suite."

"Calligra is for the kind of people that are allergic to the word 'office'," said Boudewijn Rempt, maintainer (head developer) of Calligra's Krita sketching program.

Calligra 2.5 is the second stable version of the suite. The team also released a QML (Qt Modeling Language)-based version for tablets and smartphones called Calligra Active, and said that Calligra Active will be the main mobile version of the suite going forward.

Calligra is currently developing a version of Calligra Active for Android, Rempt said.

Krita was updated with a new composition docker that is useful for storyboard generation. And Krita added textured painting and performance improvements among other enhancements, Calligra said on its website. Database program Kexi, diagram application Flow and presentation program Stage were also updated, as well as spreadsheet program Sheets.

Version 2.5 also brings improvements that benefit all the suite's applications. For instance, the management of autosave files is improved, there is a new system for managing user profiles, and there is a new system for connecting shapes in diagrams or flowcharts, according to Calligra.

Rempt said that Calligra could still use some other improvements. "The next step is Windows," he said, adding that the installer for Calligra for Windows isn't perfect. The main problem is finding open source developers that want to make Calligra available on Windows, according to Rempt. The Windows effort may require financial backing instead of relying solely on volunteers, he added.

Calligra's actual user base is hard to estimate, Rempt said. However, at the moment there are between two and three million Nokia N9 phones that use a documents app based on Calligra. Krita is downloaded about 20,000 times every month for Windows machines, and the experimental Windows installer for the whole suite is downloaded about 5,000 times a month, he said. Besides that, Kubuntu, a popular Linux distribution, has decided to make Calligra a default application, according to Rempt.

One of Calligra's biggest competitors is open source office suite LibreOffice, according to Rempt. "That really is a good and mature suite," he said, adding that while Calligra currently has a fraction of LibreOffice's user base, it has more applications and is not as dependent on big vendors such as Novell and Red Hat as LibreOffice is.

That Calligra does not attract as many users as LibreOffice is "perfectly alright at this moment", said Rempt, since the team would like more time to improve the functionality of the suite.

Calligra is a relatively young but promising suite, said Michiel Leenaars, vice chairman of the OpenDoc Society, an organization that promotes the use of ODF and open document standards in the Netherlands. "The market for Office suites is dominated by a small number of solutions that originated from the eighties and nineties of the past millennium," he said in an email. Calligra does not have that ballast since it is from this century: it was forked from KOffice in 2010 and was built from scratch, he added.

Calligra's Krita competes at the moment more with specialized software than with simpler editors in other office suites, Leenaars said. One other plus for Calligra is that it is developed keeping touch devices like tablets and mobile phones in mind, he added.

TextMate 2 for OS X Goes Open Source

TextMate is one of the most popular text editors available for OS X, but the second version has been “on the way” for so long that many considered it abandonware. It turns out they were not far off the mark. Recently, MacroMates released the source code for TextMate 2 under the GPL 3 license on GitHub. Will releasing the code breathe new life into the beloved editor, or has it been put out to pasture, where forsaken code goes to die?

It could be said that TextMate and Ruby on Rails started life together, shot into the spotlight with this video. The video was meant to explain the power of the Rails framework, but also showcased the features of TextMate. TextMate was adopted as the unofficial text editor of the Ruby community, and through its plugin architecture was soon extended to deal with code of all kinds. TextMate sold well, and its developer, Allan Odgaard, promised in 2006 that TextMate 2 would be available as a free upgrade. Excited by the abilities of the editor, as well as the promise of free upgrades, developers flocked to TextMate in droves.

Unfortunately, time dragged on and updates on the status of version 2 were few and far between. Five years later, the spirits of those who were still using TextMate were lifted briefly when a public alpha was released. The alpha was, as most alphas are, buggy, and not at all a successor to the original TextMate. Fast-forward to August, and the long-awaited text editor is finally here, not as a stable 2.0 release but as an open source project.

This should be a victory for the open source community. A popular commercial product released as open source for programmers everywhere to adopt seems like the perfect end to the story, but it reminds me of another open source Mac project: Letters.app. Letters was intended to be the mail client that we all wanted on the Mac: open source, powerful, and beautiful. However, while the project did have a good-sized group of developers and high-profile people involved, it quickly fizzled out and died a quiet, lonely death as its leads went to work on other projects. The Letters project revealed one aspect of the mindset of Mac developers: that open source is great, but there is still money to be made with commercial software. Could TextMate be headed for the same desperate fate as Letters? Possibly, but this story also reminds me of one other project.

Quicksilver is one of the most fantastic Mac applications available. More than just an application launcher, it is a modern command line, a keyboard centric control station for the Mac. Originally developed and released for free by Blacktree, the launcher slowly started to show signs of decline as the developer started to lag behind as the Mac OS was updated. Late in 2006, Quicksilver was released as open source, and quietly abandoned… for a while.

In 2011 a small group of developers picked up the source code for Quicksilver and started putting serious effort into modernizing it, as well as fixing some major bugs. A new domain was bought, qsapp.com, and life was rapidly brought back into the ailing application. Today, Quicksilver is once again the gold standard for Mac automation.

TextMate obviously shares more of a similar history with Quicksilver than Letters, but will that be enough? Quicksilver was brought back to life because a small team decided that they really loved that application, and that because of that love they were willing to spend serious time and resources on making it the best that they could. Now that Allan Odgaard has put down the code, who will be willing to take it up again? Will TextMate fail and die like Letters, or will it flourish into new life like Quicksilver? Only time will tell.

Tuesday, August 14, 2012

Red Hat to release enterprise-ready OpenStack package

OpenStack will run on the company's flagship Linux distribution, Red Hat Enterprise Linux

Red Hat plans to release an enterprise-grade version of the OpenStack open source software for hosting IaaS deployments. The company has posted an unsupported preview edition of the package, ahead of its full commercial release expected in early 2013.

"From the Red Hat perspective, we feel the next release of OpenStack will be the right one to begin offering enterprise-grade services," said Brian Stevens, Red Hat CTO and vice president of worldwide engineering. "With the preview release, customers can get experience in operationalizing and deploying [OpenStack] and, most importantly, get their voices heard before our product is done."

Red Hat's release of OpenStack will run on the company's flagship Linux distribution, Red Hat Enterprise Linux. This version has been tested to work on RHEL 6.3 and requires Red Hat Enterprise Virtualization (RHEV) to operate. The company has already started working with a select group of customers that are trying the software.

The first commercial release will be based on the upcoming Folsom release of OpenStack, due in September. The preview edition, in addition to OpenStack, includes a number of Puppet modules to ease configuration. The commercial release will also come with an installer and greater integration with Red Hat's CloudForms hybrid cloud management software.

Begun two years ago by NASA and Rackspace, the OpenStack project is an effort to create a stack of open source software that can be used to provide IaaS cloud services. A modular software stack, OpenStack consists of separate programs to provide compute, object storage, image management, and other needed services for running cloud operations. The project rapidly gained popularity, attracting at last count the development efforts of over 3,300 programmers and 185 companies.

Red Hat has been devoting increasing amounts of its engineering efforts to open source cloud software projects. In April, Red Hat joined the OpenStack Foundation, which will shortly take the reins as the governing body for maintaining the OpenStack project. Currently it is being managed by cofounder Rackspace, which wants to move the project to a more vendor-neutral party. In April, the project leaders released a survey that found that Red Hat was the third largest contributor to the project, after Nebula and Rackspace.

Pixar Releases OpenSubdiv Under An Open Source License

Most people can probably agree that Pixar is one of the most influential animation studios of all time. Their films have been not only critical and commercial hits, but important to the progression of animation technology as well. The technology Pixar uses in their films is some of the most impressive in the business. Now you can use it yourself for free.

Pixar has decided to open source their Subd evaluation code. It’s called OpenSubdiv and it’s “a set of open source libraries that implement high performance subdivision surface evaluation on massively parallel CPU and GPU architectures.” With the release, Pixar hopes to “encourage high performance accurate subdiv drawing by giving away the ‘good stuff’.”

This is a huge deal for both Pixar and the development scene as a whole. By making their software open source, Pixar opens the doors to programmers of all backgrounds to help improve it and change the software.

It’s also big news for hobbyist animators and programmers because Pixar has released the code under the Microsoft Public License. Animators and programmers can release work made with Pixar’s code for non-commercial and commercial use. Pixar didn’t just release its code; it is letting people make money from it too.

The software is currently in beta, but Pixar will keep putting out new updates over time. The source code is available to all at GitHub. I can’t wait to see what amateur animators do with the software. If this release goes over well, they might start to release other software as well.

Wednesday, August 8, 2012

How Mozilla keeps its MySQL database tidy

Mozilla is an open source company whose developers make a lot of different versions of software code each day, and part of Sheeri Cabral's job is to keep track of them all: which ones work, which don't, how many times they've been downloaded, and which have a bug that needs to be fixed.

To do that, the makers of the Firefox browser have a MySQL database, the common open source structured database system, which organizes the information in a table format. A few months ago Cabral, who is a database administrator and architect for Mozilla and a MySQL community contributor, began running into issues as the database grew to over 100GB. "If that database doesn't work, the downloads aren't available," she says, emphasizing the importance of the database.

Typically the solution to such a problem is to throw more compute capacity at the server housing the database, or potentially to switch from spinning hard disk drives to solid state drives, says Paul Burns, an analyst at Neovise. But Cabral found a different solution: Instead of having one single database, Mozilla has in effect virtualized its database by splitting it up into a group of clusters, each holding a portion of the database. Using technology from a company named ScaleBase, when a query is made the ScaleBase software now identifies the cluster where the data is stored so that the entire database doesn't have to be searched. This speeds performance without adding hardware. "This is not an easy thing to do," Cabral says, "but they seem to have done it and it's working."

ScaleBase was born out of an Israeli consultancy a few years ago. After receiving several requests from customers to help scale MySQL databases for Web-based and mobile applications, the idea of sharding the database, or splitting it up into smaller bite-size chunks, was tested. It worked for a variety of customers, so a business was born to sell the product on a wider scale, says Paul Campaniello, VP of global marketing for ScaleBase. "People have virtualized machines, storage, operating systems," he says. "No one has really virtualized the MySQL database yet."

After receiving venture capital funding two years ago, the company brought ScaleBase into GA this year and since then it has built out its management team, including bringing on now-Executive Chairman Ram Mester, a former vice president at IBM's information management division where he led the database management, security and optimization practices.

ScaleBase describes its flagship Data Traffic Manager as a load balancing tool that sits between an application and the backend database used to store data for the program. When the software is used for the first time, ScaleBase will automatically analyze the database and partition it into multiple instances. Once a query is made, it routes the client request to the appropriate instance within the database. Pricing is based on the size of the database being managed.
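
The routing idea described above can be sketched in a few lines of Python. This is only an illustration of hash-based sharding in general, not ScaleBase's actual algorithm, which the article does not describe. The shard count, the `shard_for` function, the `ShardedStore` class, and the sample key are all hypothetical, and plain dictionaries stand in for the MySQL instances.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical number of database instances


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy in-memory stand-in for a group of database instances."""

    def __init__(self, num_shards: int = NUM_SHARDS) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        # Write the record to the single shard that owns this key.
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        # Only the owning shard is consulted, never the whole data set.
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore()
store.put("firefox-build-1234", {"downloads": 10})
print(store.get("firefox-build-1234"))
```

Because the hash is stable, a lookup always lands on the same shard that received the write, which is why the whole database does not have to be searched.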

ScaleBase's sharding technique is not a novel concept, but it is one of the first implementations of the technology in databases, and specifically MySQL databases, says Burns, the Neovise analyst. "These databases haven't traditionally been something that you break up, but ScaleBase takes a sharding algorithm to it and makes multiple copies of the data on different servers," he says. "They've made sharding easy to do and automated it." The technology could be helpful for anyone running a MySQL database, which is common in the open source world, and it could be especially helpful when those databases begin to scale to large sizes, Burns says.

Cabral and Burns have some reservations, though. The open source community has very much a do-it-yourself attitude. Some open source MySQL database administrators may not be interested in purchasing a product to handle the functionality and would instead build solutions in-house or rely on an open source community to supply the technology. Cabral says she explored that option, but there just weren't open source community tools available with the functionality that ScaleBase had. To expand, ScaleBase does have an opportunity to support other open source databases, or it could even branch out to managing other types of databases, including tackling the growing big data problem of unstructured data.

Open-source movements butt heads over logo

Open Source Initiative and Open Source Hardware Association trying to bridge differences over similar-looking logos

A gear logo proposed to represent and easily identify open-source hardware has caught the eye of the Open Source Initiative, which believes the logo infringes its trademark.

The gear logo is backed by the Open Source Hardware Association (OSHWA), which was formally established earlier this year to promote hardware innovation and unite the fragmented community of hackers and do-it-yourselfers. The gear mark is now being increasingly used on boards and circuits to indicate that the hardware is open-source and designs can be openly shared and modified.

OSI has now informed OSHWA, which is acting on behalf of the open-source hardware community, that the logo infringes on its trademark. The issue at stake is a keyhole at the bottom of the open-source hardware logo, which resembles a keyhole at the bottom of the OSI logo. The gear logo was created as part of the contest hosted by the group that founded OSHWA, and the mark was released by its designer under a Creative Commons license, opening it up for the community to use on hardware.

More than a year on, OSHWA is still in talks with OSI and both believe a resolution is near. The issue has sparked a debate on OSHWA's website, with some community members accusing OSI of policing and asking the open-source hardware organization to steer clear of OSI's licensing terms. OSI has established logo usage and trademark guidelines on its website.

OSHWA is also engaging the community on whether it should facilitate creation of a new logo or license the gear logo as a derivative work from OSI. OSHWA could theoretically contest OSI's claims in court, but it would be a waste of resources and create a wedge between open-source organizations, whose main objective is not to fight but to cooperate, wrote OSHWA president Alicia Gibb in a blog entry.

OSHWA follows the open-source ethos of working together to tweak, update and share physical hardware designs with the goal to improve products. The fledgling organization is still trying to sort out legal and licensing issues, and observers said this could be a litmus test for OSHWA's viability and the gear mark's use for hardware certification.

The gear logo has gained in popularity, and OSHWA director Nathan Seidle would love to see it stick around.

"We want to see the gear logo stamped onto bicycle parts, on the back of a wrist watch, on the bottom of a chair," said Seidle, who is also CEO of Colorado-based SparkFun Electronics, in an e-mail.

OSI, which is more grounded in software, tends to take a conservative approach to trademarks and legal discussions, which makes communication difficult, Seidle said. But OSHWA does not want trademark or legal battles with anyone, Seidle said.

W&L's R.E. Scholars Design Software for Surveillance Drones

In a war zone, an abandoned building may be filled with hidden hazards. Bombs. Booby traps. Snipers. Thanks to software designed by Washington and Lee computer science professor Simon Levy and his summer research students, American soldiers may soon be detecting these dangers using miniature surveillance drones.

Levy and his students — junior Suraj Bajracharya and sophomore Bipeen Acharya, both from Nepal, and junior Olivier Mehame of Rwanda — are working with Advanced Aerials, a Navy contractor, to develop the software, which will be embedded on a wrist-mounted controller. Their program would allow soldiers to tap out simple commands on the controller’s touchscreen.

“Imagine a scenario where they’re trying to figure out what’s in a particular building, and they don’t want to run in there. There may be explosive ordnance…or they may be under attack,” explained Levy. “So the idea is, you can take this [drone] out of a pack and toss it in the building and have it flying around looking for things, with cameras on it.” The cameras would record a live feed of the building’s interior.

Levy and Advanced Aerials, a VTOL UAV rapid prototyping company north of Charlottesville, will demo the software for the Navy this fall. The project offers Levy’s students a unique research opportunity because they are building a commercially viable product. “There’s actually a customer who wants this technology,” said Levy.

The students, all Robert E. Lee scholars, spent the first part of the summer coding commands for the drone. Their goal? To keep the coding as clean and simple as possible. They also wanted to create a visually appealing touchscreen.

Levy’s students started coding as soon as the project began. This would not have been possible a few years ago, said Levy. Until recently, the only small computing devices available were micro-controllers. These special-purpose computers have their own language and limitations, and the students would have needed time to learn their computer’s particular platform.

Programming today is much simpler. “You can get an entire computer that’s already good to go, with programming on board. The students don’t have to learn much beyond what they’ve already learned in their computer science courses,” said Levy.

Technology is also more affordable than it was in the past. “We’re currently working with what’s called commercial, off-the-shelf technology,” said Levy. “Most of this stuff costs between $20 and $200. It’s very inexpensive technology. You can buy it at Amazon or some supplier online. I’m just using, basically, little $20 webcams for this.”

According to Levy, the drones and controllers to be used in the demo are supposed to be easily portable and disposable. Part of the challenge for the students was to keep the coding clean and efficient. Levy reviewed their work daily to correct mistakes. “We would be doing it the long way, and he would come in and tell us to use a function,” said Acharya. These functions, or shortcuts, removed repetitive code and kept the program lean.

The team’s software is highly adaptable and can be embedded in a variety of small, open-source computing devices that operate on a Linux platform, from BeagleBoard to Raspberry Pi to Gumstix. “The code we write for one of the devices can work on any of the other devices,” said Bajracharya. They decided not to design an iPhone-ready program because the Apple device requires specialized code that is not easily adaptable to other platforms.

A prototype drone was unavailable in June, so the team tested commands on a server that acted as a substitute for the actual robot. Levy will test the software on a drone later this summer. In September, Levy and the head of Advanced Aerials, Bert Wagmer, will demo the device for the Navy. “If that works out, we’ll have a bigger product due a year out from that,” said Levy.

The students agreed that they learned a lot about programming. “It made me get more interested in programming because I hadn’t worked with designing things graphically as an interface,” said Mahame. He also enjoyed “doing the coding to connect to the graphical user interface and making it more dynamic. I think this project was really cool.”

Gaining hands-on experience with a high-level commercial project was another benefit for his research assistants, said Levy. “You can come to a place like W&L and get some really interesting projects under your belt very quickly,” he said. “This industry is really cooking. I know they won’t have any trouble finding a job or going to grad school.”

Explore the Power of Data-Centric and Data-Driven Android Applications with Android Database

Packt is pleased to announce Android Database Programming, a book that strives to weave together the exciting worlds of mobile programming and data management with both conceptual and practical code-filled examples.

Android is a mobile operating system based upon a modified version of the Linux kernel. The Android Open Source Project (AOSP) is tasked with the maintenance and further development of Android. The Android operating system software stack consists of Java applications running on a Java based object oriented application framework on top of Java core libraries running on a Dalvik virtual machine featuring JIT compilation.

Android Database Programming is a book meant to prepare its readers for this new world we live in. It's a book that not only strives to teach its readers all the different data storage methods available, but which also strives to illuminate the strengths and weaknesses of each method – placing a greater emphasis on conceptual thinking.

By the end of this book, readers will be able to craft an efficient, well-designed, and well-thought-out data-centric application.

This book is ideal for readers with a wide range of technical backgrounds, ranging from those with a limited Android programming background but who may be well-versed in database designs, to those who are experienced in Android programming but who may need to brush up on database schemas and database queries. The book is out now and available from Packt in print and popular eBook formats. To read more, please visit the Packt website.

Tuesday, August 7, 2012

Chinese Researchers Program Tesla GPUs with OpenACC

NVIDIA today announced that the OpenACC programming standard has enabled Chinese researchers to dramatically accelerate the DNADist genomics application, which is used in the early stages of development of treatments for genetic conditions, such as Down syndrome, hemophilia, cystic fibrosis, and sickle-cell disease.

Using the CAPS enterprise OpenACC compiler, Shanghai Jiao Tong University researchers accelerated the DNADist application by 16 times on an NVIDIA® Tesla® GPU-based system by adding just four simple hints -- known as "directives" -- to the application code.

DNADist, a distance-matrix application for studying the genetic relationships between various species over evolutionary history, enables researchers to extract information from sequenced DNA data by reading nucleotide sequences, which may potentially lead to a greater understanding of the causes of and treatments for pervasive genetic diseases. Accelerating the DNADist application allows researchers to study a significantly larger range of input data and obtain actionable information earlier in the disease treatment research process.

A programming standard for parallel computing using directives, OpenACC is designed to enable millions of researchers around the world to easily take advantage of the transformative power of GPU computing. It provides the easiest way for users, with or without extensive parallel programming expertise, to accelerate their research in a matter of hours using familiar programming models.

Roche Impressed with OpenACC, Power of GPU Acceleration

By quickly delivering game-changing application acceleration with minimal effort, OpenACC provides world-leading pharmaceutical companies like Roche with the ability to research, identify and develop more effective drugs faster and more cost-effectively.

"I am astonished at how quickly and easily OpenACC unlocked the power of GPU acceleration for DNADist, which is one of our most critical applications," said Steve Pan, project director, Roche Pharma Global Informatics. "The potential impact of GPUs is priceless because getting our products to market faster, even one day earlier, will save more lives."

"Extracting meaningful information from the vast collection of available DNA sequencing data requires ever-increasing amounts of computational power," said Sumit Gupta, senior director of the Tesla business at NVIDIA. "OpenACC enables researchers to quickly and easily leverage the enormous performance of GPU accelerators to analyze mountains of genomics data. This dramatically reduces the time to study biological systems, and potentially leads to the development of more effective next-generation medicines."

A large and growing number of researchers and engineers are using OpenACC-supported compilers and hybrid CPU/GPU computing systems to accelerate all types of applications, including CAD/CAM, image processing, materials science, molecular dynamics, quantum chemistry, and many other applications. In many cases, users are reporting that they have achieved as much as 5-10X or faster levels of acceleration in as little as a few hours of work.

Saturday, August 4, 2012

VirtualBox 4.2.0 Beta 1

VirtualBox is a general-purpose full virtualizer for x86 hardware. Targeted at server, desktop and embedded use, it is now the only professional-quality virtualization solution that is also Open Source Software.

Some of the features of VirtualBox are:

Modularity. VirtualBox has an extremely modular design with well-defined internal programming interfaces and a client/server design. This makes it easy to control it from several interfaces at once: for example, you can start a virtual machine in a typical virtual machine GUI and then control that machine from the command line, or possibly remotely. VirtualBox also comes with a full Software Development Kit: even though it is Open Source Software, you don't have to hack the source to write a new interface for VirtualBox.

Virtual machine descriptions in XML. The configuration settings of virtual machines are stored entirely in XML and are independent of the local machines. Virtual machine definitions can therefore easily be ported to other computers.

New Language For Image Processing is Halide

Halide is a new open source language designed specifically for image processing and computational photography. It not only makes it easy to implement photo algorithms, it also makes them run fast by semi-automatic parallelization.

Algorithms that work with images are ideal for parallel implementation because they usually work with small isolated blocks of data, which means the task can be parallelized without worrying about interactions. The only problem is that converting even something that is ripe for parallelization from serial code to something that runs on today's confusing architecture of CPU cores and GPUs is difficult.

Halide is a new functional programming language from MIT (with help from Stanford and Adobe) that allows you to specify image processing algorithms, mostly block convolution methods, more easily and without having to worry about how the algorithm is implemented. A second section of the program then provides a general description of how the algorithm should be parallelized. It describes not only how the algorithm should be split up among computational elements but also how to organize the data to keep the processing pipelines running at maximum efficiency by avoiding restarts.

The easiest way to understand the general idea is to see a simple example (taken from the paper):
Func halide_blur(Func in) {
  Func tmp, blurred;
  Var x, y, xi, yi;
  // The algorithm
  tmp(x, y) = (in(x-1, y) +
          in(x, y) + in(x+1, y))/3;
  blurred(x, y) = (tmp(x, y-1) +
          tmp(x, y) + tmp(x, y+1))/3;
  // The schedule
  blurred.tile(x, y, xi, yi, 256, 32)
        .vectorize(xi, 8).parallel(y);
  tmp.chunk(x).vectorize(x, 8);
  return blurred;
}

The first part of the program defines a simple 3x3 blur filter split into a horizontal blur followed by a vertical blur step. The last part of the program, the schedule, specifies how the algorithm can be treated in a parallel implementation. The schedule is machine specific and has to be changed to get the best performance out of a particular processor pipeline.
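
For readers without Halide at hand, the algorithm half of the example can be approximated in plain Python. This is only a rough sketch: it runs serially, so it has no counterpart to the schedule, and it clamps coordinates at the image border, which is an assumption because the excerpt does not say how edges are handled. The function name `blur_3x3` and the list-of-rows image format are hypothetical.

```python
def blur_3x3(image):
    """Separable 3x3 box blur on an image given as a list of rows of numbers."""
    height, width = len(image), len(image[0])

    def clamp(v, hi):
        # Assumption: out-of-range coordinates are clamped to the nearest edge.
        return max(0, min(v, hi))

    # Horizontal pass: average each pixel with its left and right neighbours.
    tmp = [
        [
            (row[clamp(x - 1, width - 1)] + row[x] + row[clamp(x + 1, width - 1)]) / 3
            for x in range(width)
        ]
        for row in image
    ]

    # Vertical pass: average each intermediate value with the rows above and below.
    return [
        [
            (tmp[clamp(y - 1, height - 1)][x] + tmp[y][x] + tmp[clamp(y + 1, height - 1)][x]) / 3
            for x in range(width)
        ]
        for y in range(height)
    ]


flat = [[6, 6, 6], [6, 6, 6], [6, 6, 6]]
print(blur_3x3(flat))  # a uniform image is unchanged by the blur
```

The two passes mirror `tmp` and `blurred` in the Halide listing. What Halide adds is the ability to change tiling, vectorization, and parallelism in the schedule without touching these two definitions.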

Wednesday, August 1, 2012

Chaos Monkey released by Netflix

Chaos Monkey, the first member of Netflix's "Simian Army", has been released as open source code. The software runs on the Amazon Web Services (AWS) cloud platform and can be used for stress testing cloud deployments. Chaos Monkey will randomly disable virtual machines in Auto Scaling Groups (ASG) and give the support engineers a chance to test contingency plans for outages under realistic circumstances. This gives administrators a chance to learn from the problems encountered. Chaos Monkey's schedule can be configured, but by default it runs during work hours to give the engineers a chance to be notified and react quickly.

According to Netflix, the idea behind Chaos Monkey is to make sure engineers and administrators are prepared for problems and can solve them efficiently when they occur. In its announcement, the company summarises the reasons for deploying the software as follows: "Failures happen and they inevitably happen when least desired or expected. If your application can't tolerate an instance failure would you rather find out by being paged at 3am or when you're in the office and have had your morning coffee?"

Netflix has been using this kind of approach for a while and says that over the last year, Chaos Monkey has disabled over 65,000 virtual machine instances in its network. In many cases, AWS handles these outages seamlessly and nobody notices a problem but the approach has also led to bugs and problems being discovered which could then be eliminated.

To make the approach less painful, Chaos Monkey has several configuration options. It can be used in both opt-in and opt-out mode; this is configured for each application the service is run on. This allows organisations to test the software without putting their whole infrastructure at risk. Administrators can also tune the probability with which Chaos Monkey terminates instances. Probability settings range from 100% (one instance a day) to 20% (one instance per week on average).
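
The arithmetic behind those two settings can be checked with a short Python sketch. It assumes the probability is applied once per working day and that a week has five working days, neither of which the article states explicitly. The names `expected_terminations_per_week` and `pick_victim` are hypothetical and are not part of Chaos Monkey's real configuration or API.

```python
import random

WORKDAYS_PER_WEEK = 5  # assumption: the monkey only runs on working days


def expected_terminations_per_week(daily_probability: float) -> float:
    """Average number of terminated instances per week for one group."""
    return daily_probability * WORKDAYS_PER_WEEK


def pick_victim(instances, daily_probability, rng=random):
    """Return one instance to terminate today, or None if the group is spared."""
    if not instances or rng.random() >= daily_probability:
        return None
    return rng.choice(instances)


print(expected_terminations_per_week(0.2))
print(pick_victim(["i-aaa", "i-bbb", "i-ccc"], 1.0))
```

Under these assumptions a 20% daily probability averages out to one termination per five-day week, which matches the figure quoted above, while 100% means one instance every working day.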

The source code for Chaos Monkey is available from GitHub under the Apache 2.0 Licence. More information on the software is available from its documentation wiki, which is also hosted on GitHub.

Netflix is planning to release more members of its Simian Army in the future. The next candidate will most likely be Janitor Monkey, a program that helps clean up unneeded assets from AWS environments and thus save running costs.