IBM's software business contributes $20 billion of IBM's revenue and 40 percent of its profits. Suffice to say, it's an important part of Big Blue's market strategy to ensure that the software division performs at or above expectations every year.
Steve Mills, senior vice president and group executive, joined IBM in 1974 and has helped shape the software business as it has grown to more than 50,000 employees, including 25,000 software developers and 15,000 sales and technical support personnel in more than 150 countries. That total includes the products and personnel from the more than 50 companies IBM has acquired since 2000.
In 2009 alone, IBM acquired no fewer than five companies: Lombardi, a privately held provider of business process management (BPM) software, data discovery software firm Exeros, database security firm Guardium, security provider Ounce Labs, and analytics provider SPSS.
The company also launched a number of cloud-oriented products and services in 2009, including a new lab in Hong Kong; a Cloud Academy program designed to help educators and students pursue cloud-computing initiatives and better take advantage of collaboration technology in their studies; and a number of additions to the LotusLive hosted collaboration service.
In an exclusive interview with CNET News, Mills shared how the company is looking at the technology landscape in 2010 and beyond.
Question: Software strategy is obviously an important part of IBM's business model. How long of a time-to-market horizon does IBM look for with new software products?
Mills: We tend to look at product groupings and product families--customers don't use a single product. Enterprises are looking for complete solutions even if they don't buy them all at one time. That means that we're looking for leverage in software we create or acquire--how do the products complement each other and how can we plan ahead for what customers need.
As you probably know, IBM is big on process (laughs). The software business is no different, and we have a method to how we develop markets: customer, volume, revenue, and profit. You have to set the baseline to figure out how the product fits into the marketplace, and you learn this from talking to customers. Time to market and rapid iteration are important aspects that come into play in relation to the other components, but you always learn more in the market from customers than in the lab.
When we look at how well a piece of software is doing, as well as its potential, we look at volume of customers, industries, installed base, etc., and what's the trajectory of the installation. Growth objectives are unique to each product, and you rise on a series of plateaus. You have to fill the gaps that inhibit the growth. And it's not always obvious. We pay a lot of attention to our customers and also the trends in the market.
How does cloud computing play into your technology focus areas?
Mills: Cloud computing is a transformative part of the Darwinian IT phenomenon. Many companies are not interested in operating their own infrastructure as they don't see it as a competitive advantage. In which case they want to get the job done at a lower cost. Businesses realize they can grow because of IT and they want to continue to use IT to keep things growing, but that doesn't mean they need to own and manage every piece of their infrastructure.
Companies like American Express, Salesforce.com, and ADP are great examples. We see those types of system designs and customer interactivity as common models. IBM has long offered managed business process services and supported other big enterprise services.
These offerings make logical sense, but they don't always solve every problem. The hybrid public/private model is very appealing to our customers and not dramatically different than using a hosting provider.
Not everyone will be comfortable with the cloud model--it's all part of a continuum. There will be Salesforce.com on one hand, and on the other customers that run everything behind the firewall. Success doesn't mean that corporations will push everything into the cloud but the inherent cost-benefits are there and more companies are interested. That's part of the evolution.
How do you look at open-source projects/products/companies?
Mills: The hybrid companies like Red Hat have interesting models for open source. They take all the code and put it together for you, but we tend to look at open source as building blocks for larger solutions. IBM ingests a lot of open-source code and we provide a huge amount of development and engineering expertise to the various projects that we support--like Linux and the Apache server.
We focus a lot of our energy on open standards and platforms. And if there are open source projects that we believe in we'll invest resources to support them.
One of the cloud-related trends that developers have been paying attention to lately is the idea of "NoSQL," a set of operational-data technologies based on nonrelational technology.
These technologies do not replace the relational database but rather add a new tool to the developer toolbox. Business intelligence database technologies such as Aster Data, Greenplum, Netezza, and Vertica do not completely replace the traditional relational database but rather use nonrelational databases to augment the software.
RedMonk analyst Stephen O'Grady wrote recently that NoSQL "adoption was inevitable because, just as in every other walk of life, there are different tools for different jobs in the technology world." NoSQL may not be exactly the right moniker, but the companies and developers behind these tools have legitimate substantiating points as to why the approach is right.
According to Dwight Merriman, CEO of 10gen (the commercial team behind the open-source MongoDB project), we'll see NoSQL complement existing applications for the foreseeable future.
The broad range of NoSQL tools that include projects like Cassandra, CouchDB, Hadoop, Memcached, and MongoDB bring to bear a number of technical advantages--even if no one tool does everything.
Horizontal scalability
Horizontal scalability, readily achievable for NoSQL solutions, fits incredibly well with cloud computing and general trends in computer architecture--toward more CPU cores rather than faster ones.
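To make the idea concrete, here is a minimal sketch of hash-based partitioning, one common way to spread keys across several machines. It is an illustration only, not how any of the projects named above implements partitioning: the `shard_for`, `put`, and `get` helpers are hypothetical, and the in-memory `nodes` list stands in for real servers.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Pick a shard by hashing the key (stable across runs and machines)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Each dict stands in for one server; scaling out means adding more of them.
nodes = [dict() for _ in range(4)]

def put(key: str, value) -> None:
    """Store the value on whichever node owns the key."""
    nodes[shard_for(key, len(nodes))][key] = value

def get(key: str):
    """Read the value back from the node that owns the key."""
    return nodes[shard_for(key, len(nodes))].get(key)

put("user:1", {"name": "Ada"})
put("user:2", {"name": "Bob"})
print(get("user:1"))
```

Because each key lives on exactly one node, reads and writes for different keys can proceed in parallel on different machines. Simple modulo hashing moves many keys when the node count changes, so production systems typically use more refined schemes such as consistent hashing.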
Performance
In some cases, the simplification of design of these solutions, as well as lack of normalization of the data, yields better performance. This often results in the developer not coding around the database.
Ease of assembly
Some NoSQL solutions facilitate easier software development. Mapping object data to JSON, a JavaScript data interchange format, is far less complex. The "schemaless" nature of many of these products is an excellent fit with agile development methodologies.
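As a rough illustration of why object-to-JSON mapping is simple, the sketch below serializes a plain object with Python's standard `json` module. The `Customer` class and its fields are invented for this example, and no particular NoSQL product's API is shown.

```python
import json

class Customer:
    """A hypothetical application object."""
    def __init__(self, name, email, tags=None):
        self.name = name
        self.email = email
        self.tags = tags or []

customer = Customer("Ada", "ada@example.com", ["beta"])

# The object's attributes map directly onto a JSON document.
document = json.dumps(vars(customer))
restored = json.loads(document)

# "Schemaless": a second document can carry a field the first one lacks,
# with no table definition to migrate first.
other = dict(vars(Customer("Bob", "bob@example.com")), loyalty_points=120)
other_document = json.dumps(other)
```

There is no join table for `tags` and no schema change for `loyalty_points`, which is the property that suits iterative development.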
The typical software system of moderate complexity has many real and conceptual internal data stores. No one technology will be the right solution for all problems.
Forward-looking organizations should look at which technologies are appropriate for different data subsystems and begin to evaluate NoSQL technologies for appropriate projects.
Most businesses seek competitive advantage through some kind of change. Whether they want to beat the competition to market with a new service or introduce new product categories, disruption is the norm.
The challenge in today's IT-centric world is that every one of those disruptions requires a software change, introducing the potential for downtime and lost revenue.
Change control and the associated risk mitigation is a big problem that every large organization faces. Last year, the London Stock Exchange crashed during a software change and was down for more than seven hours, costing traders millions, if not billions, of dollars in lost business. This year we've had high-profile outages at Salesforce.com, Twitter, and Amazon's EC2, among others, affecting tens of millions of people.
No company is immune to this type of risk and companies that want to stay on the leading edge need to embrace these changes in order to stay competitive.
Coverity, a software integrity firm perhaps best known for its SCAN project of open-source software sponsored by the Department of Homeland Security, thinks it has the preventive medicine to help organizations avoid the inevitable errors, defects, and failures that software change can introduce.
The company's latest release, Coverity 5, promises to mitigate the business risk of software changes across an organization's entire software portfolio. It claims this is the first product that lets developers automatically map and identify how a single defect impacts multiple code bases, projects, and products. Through a unified defect management interface, it also can help organizations review, prioritize, and triage their C/C++, Java, and C# defects in a single workflow.
This approach lets an organization quickly answer five key questions of software change management:
- How do I find defects introduced by changes?
- How do I know the severity of new defects?
- How do I know the impact to my code, my projects, my products?
- How do I fix them fast?
- How do I know I fixed them?
Today, market opportunities are changing faster than businesses can deliver. When your organization changes software, how quickly can you answer the five questions above?
The Tennessee Valley Authority is the nation's largest public power provider, serving approximately 9 million consumers in seven southeastern states. The organization also happens to be a big supporter of open-source projects, including Hadoop, a tool designed for deep analysis and transformation of very large data sets.
Earlier this year, the Tennessee Valley Authority (TVA) announced that it had open-sourced the system it uses to collect data from smart grid devices called phasor measurement units (PMUs). The data collection system is known in the industry as a Super Phasor Data Concentrator (SuperPDC), which can be used to determine the health of a power grid.
The open-source version of the SuperPDC is now called the "OpenPDC." I spoke to both Ritchie Carroll (RC), the project's creator, and Josh Patterson (JP), the person responsible for introducing Hadoop to the project, to discuss what the OpenPDC is and why TVA turned to Hadoop in building the system.
What sort of data volumes are you working with?
RC: Currently there is around 20 TB of archived data. We expect this to grow quickly as a result of the SmartGrid stimulus funding, which includes the addition of 850 phasor measurement devices. This may well grow the archive to half a petabyte within the next few years.
How is this data currently captured and managed? Is any data discarded?
JP: Data is collected directly from field devices at 30 times per second. This data is then time-aligned and processed in real-time--all data gets captured into a binary data file as time-series data for mass processing by Hadoop.
RC: No data is currently discarded. If we get to the point of needing to discard data because of cost, this will be a decision based on the weighed importance of the collected data. It is likely the data around major events will never be deleted because it will always be valuable for future student researchers. There is also value in being able to go back in time and look for newly discovered event signatures to see how long they might have been occurring.
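The time-alignment step mentioned in the answers above can be sketched roughly as follows. This is an assumption-laden illustration, not the OpenPDC implementation: the `frame_index` and `time_align` helpers, the device names, and the sample values are all hypothetical. Only the rate of 30 measurements per second comes from the interview.

```python
from collections import defaultdict

FRAMES_PER_SECOND = 30  # collection rate stated in the interview

def frame_index(timestamp_seconds: float) -> int:
    """Snap a timestamp to the nearest 1/30-second frame."""
    return round(timestamp_seconds * FRAMES_PER_SECOND)

def time_align(samples):
    """Group (device_id, timestamp, value) samples into shared frames."""
    frames = defaultdict(dict)
    for device_id, timestamp, value in samples:
        frames[frame_index(timestamp)][device_id] = value
    return dict(sorted(frames.items()))

# Hypothetical readings whose clocks differ by a fraction of a millisecond.
samples = [
    ("pmu_a", 0.0001, 59.98),
    ("pmu_b", 0.0004, 60.01),
    ("pmu_a", 0.0334, 59.99),
    ("pmu_b", 0.0331, 60.02),
]
aligned = time_align(samples)
```

Each frame then holds one value per device for the same instant, which is the shape a time-series archive needs before batch analysis.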
Influence in open-source development communities is earned through years of writing and sharing great code. Perhaps not surprisingly, then, influence in the business side of open source is also gained through sharing expertise, and not necessarily from making mountains of cash.
At least, that's the lesson I take away from MindTouch's inaugural survey of 50 open-source business executives. MindTouch, an open-source collaboration company, has spent the last few months surveying executives within the commercial open-source community, asking them to name the most influential people within the commercial open-source ecosystem.
The result is effectively an all-star list of open-source business executives. The top five are as follows:
- Larry Augustin, CEO, SugarCRM
- Matt Asay, vice president of business development, Alfresco (and fellow CNET blogger)
- Mårten Mickos, entrepreneur-in-residence, Benchmark Capital, and former CEO, MySQL
- Jim Whitehurst, CEO, Red Hat
- Dries Buytaert, co-founder and CTO, Acquia
The full list is available here.
The common theme running through these top-five vote getters is how open they've been with their peers. Larry Augustin sits on several boards of open-source companies, but he also frequently speaks at industry events and has been involved in open source from its inception.
Matt Asay, my friend and fellow CNET blogger, sits on more than 10 open-source advisory boards, chairs the Open Source Business Conference, hosts an informal get-together every year (called Open Source Goat Rodeo--don't ask why), blogs at an unhealthy rate for CNET on open source, and has actively helped a range of aspiring open-source entrepreneurs understand the mechanics of running an open-source business.
Mårten Mickos made the world safe for the $1 billion open-source acquisition, but he has also traveled the globe speaking at open-source events and is very generous with his time, sharing know-how and best practices with other open-source executives.
Jim Whitehurst, breaking the typical Red Hat mold, has been active in industry events and has hosted a range of dinners and other small-scale, intimate events with open-source executives. He is amazingly accessible, given that he has a fast-growing open-source company to run. It's unfortunate that Whitehurst is the only Red Hat executive to make the list; Red Hat should follow his lead and be more permeable to its peers. Its influence would grow accordingly, just as Whitehurst's has.
Finally, there's Dries Buytaert, who blogs frequently on his project, Drupal, but also regularly attends and speaks at industry events. He has also been active behind the scenes, working with other open-source companies to share information on how to optimize community development.
Open-source code becomes valuable when you give it away. The same holds true for open-source business expertise. There are individuals who have made more money than these with open-source software, but in terms of influence, the more you share, the more influential you become.
What do you think? Who else should be on the list? Who influences you?
If you need further proof that open-source applications are ready for prime time, take today's news from open-source business intelligence company Jaspersoft, which announced that British Telecom is using its business intelligence suite to support more than 8 million voice mail subscribers.
BT and Unisys, a longtime Jaspersoft partner, say they chose Jaspersoft for its modular design, which reduces maintenance and cost and gives them customization abilities that improve capacity planning.
The deal with BT also represents how important a solid channel strategy is for open-source software companies.
Jaspersoft CEO Brian Gentile has in the past mentioned that the BI market is heavily influenced by a few technical aspects, including SOA/Web services (and overall componentized design), in-memory analytics, integrated search, and the use of rich media services to provide more compelling (Web-based) user experiences.
The other obvious factor in the shift to open-source BI (and open source in general) is the economics behind the applications and ongoing operations. And perhaps more important is the control--both on-premise and online. As consultant Carlo Daffara noted recently, "the critical aspect is being able to assess this control and weight if the lack of control is compensated by the features you get (which is reasonable) or what kind of risk you are accepting in exchange."
In conversation earlier today, Gentile further asserted, "open-source software is both augmenting and displacing aged, proprietary solutions across industries and at the largest companies. British Telecom is just one example of a company that has realized traditional, proprietary software is just too expensive and too complex. The most aggressive companies figured this out long ago. But now, with heightened economic pressures and the feature maturity of open source, the secret is out and the choice is clear."
There was a time when people would debate whether or not open-source software was reliable enough to support a small office. Those days are long gone. The down economy and maturity of open source are the perfect storm for major disruption.
MySpace today announced a new open-source project called Qizmt, a distributed computation framework developed by its data mining team.
Qizmt is based on the MapReduce distributed processing framework, well-known as a core part of Google's search indexing infrastructure. Qizmt, however, runs on large clusters of Microsoft Windows servers, an interesting sidebar to a computing style we most commonly associate with commodity Linux machines.
MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key.
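The map and reduce roles described above can be illustrated with the classic word-count example. This is a single-process sketch of the programming model only, not Qizmt's or Hadoop's API: `run_mapreduce`, `map_fn`, and `reduce_fn` are hypothetical names, and a real framework would distribute the grouping step across a cluster.

```python
from collections import defaultdict

def map_fn(doc_id, text):
    """Map: turn one (key, value) input into intermediate (word, 1) pairs."""
    for word in text.lower().split():
        yield word, 1

def reduce_fn(word, counts):
    """Reduce: merge all intermediate values that share the same key."""
    return word, sum(counts)

def run_mapreduce(inputs, map_fn, reduce_fn):
    """Run map, group intermediate values by key, then run reduce."""
    groups = defaultdict(list)
    for key, value in inputs:
        for intermediate_key, intermediate_value in map_fn(key, value):
            groups[intermediate_key].append(intermediate_value)
    return dict(reduce_fn(k, v) for k, v in groups.items())

docs = [("d1", "open source tools"), ("d2", "open data")]
result = run_mapreduce(docs, map_fn, reduce_fn)
print(result)  # {'open': 2, 'source': 1, 'tools': 1, 'data': 1}
```

Because each map call sees only one input and each reduce call sees only one key, both phases can run in parallel on many machines, which is what makes the model attractive for very large data sets.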
I spoke with Java architect and distributed systems expert Eugene Ciurana about MapReduce and he contends that "indexing large amounts of unstructured data is a difficult task regardless of the technologies involved. MapReduce provides a simple, elegant solution for data processing in parallelized systems."
As more sites move to manage large data sets, the uptake of frameworks like MapReduce and projects like Hadoop is sure to grow. And along with the growth of the data is the growth of the market opportunity. Open source is a great way to expand and enlarge the adoption curve as users figure out the best way to use these new tools.
Qizmt is currently being used in the MySpace "People You May Know" feature, and will soon expand to user recommendations and other new areas.
Follow me on Twitter @daveofdoom.
Hadoop is the popular open-source implementation of MapReduce, a powerful tool designed for deep analysis and transformation of very large data sets. It enables you to explore complex data, using custom analyses tailored to your information and questions. It's also one of the most buzz-worthy, talked about open-source projects around.
I spoke with Christophe Bisciglia, Hadoop World organizer and founder of Cloudera, to ask some questions about this inaugural event. And by the way, if you're interested in attending, click on the link in the answer to question No. 5. (My readers get a 25 percent discount if you register before September 15.)
Q: How can you explain the buzz around Hadoop? It's deafening.
In a new study on open-source adoption in the business intelligence (BI) market, it's becoming clear that both the benefits and shortcomings of open source software are nearly universal across all technology segments.
According to the study by Third Nature (sponsored by Jaspersoft and Infobright), "the top reason for adopting is still cost savings, although reduced vendor dependence and ease of integration were close to the same level. The limiting of vendor technology lock‐in and freedom from deployment restrictions were key elements of reducing vendor dependence. Some companies used open source deployments as a means of keeping their incumbent vendors honest."
The statement above is hardly unique to BI, but is perhaps germane if only because BI solutions have for so long been hugely expensive and proprietary. In past discussions with Jaspersoft CEO Brian Gentile, he has stated that BI is the least agile piece of the enterprise puzzle. Open source BI solutions mean that customers can take matters into their own hands.
The study also makes some recommendations on evaluating BI and data warehousing tools that, again, are relevant for any open source product.
- Don't focus solely on cost savings.
- Make open source the default option.
- Plan to augment, not replace, existing software with open source.
- Consider developing open source policies.
- Evaluate open source like any other software.
In the end, software needs to solve business problems. The adoption of open source gives users more alternatives to address their issues, be it cost reduction, increased business agility or just a new way to manage their data.
The Linux Foundation recently released an updated study of Linux development statistics that reveals who actually writes the kernel on which others build.
More than 70 percent of total kernel contributions come from developers working at companies, including obvious participants like Red Hat, IBM, Novell, and Intel as well as less obvious, smaller companies such as Parallels.
- Red Hat: 12.3%
- IBM: 7.6%
- Novell: 7.6%
- Intel: 5.3%
- Independent consultant: 2.5%
- Oracle: 2.4%
- Linux Foundation: 1.6%
- SGI: 1.6%
- Parallels: 1.3%
- Renesas Technology: 1.3%
- Academia: 1.2%
- Fujitsu: 1.1%
- MontaVista: 1.1%
- MIPS Technologies: 1.1%
- Analog Devices: 1.0%
- HP: 1.0%
Another interesting fact is the rate of development and constant refactoring of the kernel code. An average of 10,923 lines of code are added with an average of 5,547 lines removed every day, ensuring that the code is high quality and relevant for the most important implementations of the kernel.