Strategies for Efficient IT at UC

Summary

I propose increasing the use of Open Source Software (OSS) in 3 areas:

In thin client clusters where end users don’t have to know how to configure computers (especially in many classrooms and administrative computing). For distance learning, nxclients offer performance equivalent to the expensive VMWare products at no cost for (commercially available) client or OSS servers (freenx or Google’s Neatx).
For infrastructure software. By infrastructure software, I mean anything with server in the name. Mail-, storage-, content management-, trouble-ticketing-, web-, groupware-, database-, scientific compute-, VPN-; append server to the aforementioned stems and you have an idea of what OSS can replace. Whenever a contract comes up for renewal, we should evaluate it in the context of what OSS can provide, especially in terms of scaling and long-term costs.
In building our own software. If we build our own software with proprietary technology, we lock ourselves into very long-term dependencies on companies whose primary motive is definitely not the same as ours.

Further, I suggest that using Linux is the only viable way of dealing with the increasing amounts of digital data. UC graduate students should be instructed in its use and utilities and encouraged to use it for their day-to-day work.

Finally, I suggest that instead of acting as unpaid salespeople and support technicians for Proprietary Software ($oftware) vendors, we negotiate that they provide their products for free or pay us to host their $oftware on campus. There are enough Open Source alternatives that:

this is a viable strategy
it would be a dramatic statement that should bring UC some significant publicity (good, for a change).

I do not suggest that faculty change their software preferences, but support should be available should they wish to evaluate OSS alternatives and these should be made available side by side with $oftware on UC servers.

1. Introduction

Gov. Brown’s budget requires further cuts in all aspects of UC spending (perhaps as much as $500M). While the changes suggested here will not address a majority of the deficit, they will help to make UC much more efficient going forward. This is certainly not the first time such suggestions have been made. Various faculty whose expertise is in software development and research computing have made most of them before; I’m simply providing repetition and amplification.

It’s easiest to do nothing.

UC’s refusal to take OSS seriously echoes the Big Three Automakers' refusal to consider fuel efficiency improvement until they were forced into bankruptcy. Then, after the massive societal and financial convulsion, even some SUVs are getting 40mpg.

All transitions incur costs and discomfort. But do we drive off the cliff comfortably numb or do we take initiative and correct course before the cliff?

It is not just the low cost of OSS that makes IT more efficient. Anyone with experience with large software infrastructures knows that the initial price is usually a small (though sometimes significant) part of the whole cost. Local and global configuration, testing, hardening, local rollout, educating users, upgrades, storage, reliability, security, scaling issues, platform preferences, and dependencies all contribute to the cost of rolling out a software infrastructure.

2. Low-Hanging FruIT at UC

2.1. Thin clients

Thin client computing is a mechanism that allows lightweight, efficient user-side displays to run software that they get from a server. It is much like the existing browser paradigm, but it allows an entire desktop and application suite to be run on the server and displayed on the client. The clients can be specialized low-power devices or PCs, with or without disks.

That the client devices can be locked down to specified levels recommends thin clients for any computing for which data security is a concern. FISMA and HIPAA requirements are already becoming concerns and thin clients can be configured to disallow writing to local CD drives, portable disks and thumb drives. In other situations where these devices need to be available, a central permissions configuration file can allow certain users to use them.
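As a rough illustration of the central permissions file mentioned above, here is a minimal Python sketch. The file layout, the section names (`defaults`, `user:<name>`), and the device keys (`usb_storage`, `cd_write`) are all invented for this example and do not correspond to any particular thin client product; the point is only that a small, human-readable file can express "deny by default, allow named users".

```python
import configparser

# Hypothetical central permissions file. Section and key names are
# assumptions made for this sketch, not a real product's format.
EXAMPLE_CONFIG = """
[defaults]
usb_storage = deny
cd_write = deny

[user:alice]
usb_storage = allow
"""


def is_allowed(config_text: str, user: str, device: str) -> bool:
    """Return True if `user` may use `device`, falling back to [defaults]."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)

    # A per-user entry, if present, overrides the site-wide default.
    user_section = f"user:{user}"
    if parser.has_section(user_section) and parser.has_option(user_section, device):
        return parser.get(user_section, device).strip().lower() == "allow"

    # Anything not explicitly allowed is treated as denied.
    default = parser.get("defaults", device, fallback="deny")
    return default.strip().lower() == "allow"


if __name__ == "__main__":
    for user, device in [("alice", "usb_storage"), ("bob", "usb_storage")]:
        verdict = "allowed" if is_allowed(EXAMPLE_CONFIG, user, device) else "denied"
        print(f"{user} -> {device}: {verdict}")
```

Because the file is plain text, the same policy could be copied between campuses and adjusted with any text editor.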

Thin clients also allow transparently portable desktops, since any device that can connect to the server can display the user’s desktop with all her personal configuration intact. Access from home can be allowed (or disallowed) as well, if the network bandwidth is sufficient (I use VNC to administer a Macintosh in Nova Scotia from California; the NX technology is even faster).

For the great majority of administrative and classroom computing at UC, we should be using such thin clients rather than the usual fat/obese Windows and Macintosh clients. This technology is well-established, robust, more secure than Windows or Mac, easier to maintain and roll out, more energy efficient, and much lower cost. If many other first, second, and third-world countries can use it effectively, we should certainly be able to do so as well.

Linux thin clients can be implemented so as to run completely on a server, or to obtain their Operating System image from the server, depending on how much computation needs to be done on the client CPU. This allows classes that require heavy computation on the client side (engineering, statistics) to have a full client OS to run the computations on the client CPU, or act as a display for server-side software.

For distance learning, on campus or off, the NX protocol from NoMachine allows all platforms to connect to Linux servers which can provide all OSS and many $oftware products, but without the additional cost load to access them. MATLAB, SAS, Mathematica, SPSS, and many others can be run efficiently on single servers to service tens of clients per CPU core. This is standard operating procedure on many research clusters, which are required to run at peak efficiency.

2.2. Using OSS for Infrastructure software

The OSS approach encourages self-sufficiency and pays off especially with widespread use - exactly the case we see in UC, with its 10 campuses. Configuration is often done by human-readable configuration files that can be easily and freely shared. OSS tends to be simpler than proprietary software and once one group adopts and understands it, it can be easily (if not trivially) rolled out to further groups with little extra cost. This is definitely not the case with $oftware.

Increasingly, especially for infrastructure software, an OSS package is one of the top performing products. By infrastructure, I mean anything with server in the name - servers for:

mail
storage
backup
content management
web pages
databases
scientific compute services
security
and more.

This is a Million Dollar issue.

Ignoring these free packages costs us millions of dollars a year and weakens our long-term IT infrastructure, rather than strengthening it. It makes us dependent on entities that do not have our best interests at heart and it takes control of our infrastructure out of our hands.

In 2003 & 2004, California and Oregon promoted Open Source as part of their efforts to increase government efficiency. Both efforts were largely defeated by Microsoft lobbying, but the attempts to introduce this legislation still echo. See below for other governments which have stated that they will no longer use Microsoft products due to national security concerns.

2.3. Using OSS to build our own software

If we use $oftware tools and libraries to develop our own software infrastructure, we will be wedded to a specific vendor for the life of the software (which can be many years - Cobol anyone?). This entails yearly fees and a subservient relationship with that vendor once we have deployed anything developed with its products. So developing things with Microsoft products, Cold Fusion, Oracle, etc. (some of which are very good, hence the attraction), while it may look like a good idea at the time, is the very definition of vendor lock-in.

OSS provides many of the same tools and libraries, in many cases superior, or of equal quality. OSS has so many software development tools that it becomes a question of "How many complete Integrated Development Environments do we need?" Only when there is a compelling need and a written evaluation that describes exceptional conditions should we consider such $oftware tools going forward.

This approach is echoed in one of the proposals from the UC Committee on Finance. It does not explicitly state a preference for OSS (nor should it) but it is entirely about efficiency - how fast, how scalable, how costly, how cost-effective a solution is.

Even in the end-user desktop arena, OSS applications are increasingly competitive with proprietary ones. And contrary to many expectations, you don’t need to be running an OSS Operating System such as Linux, BSD, or OpenSolaris to use OSS applications. The most popular OSS applications are those used on MS Windows (although freedom from most MS products should be the long-term goal).

2.4. Linux is a better platform for research.

More digital data is being collected faster every day. The Large Hadron Collider will collect more than a GB/s during experiments. The volume of social networking data, clickstreams, and network data is nearly unimaginable. Satellite data is streaming down at TB/hr. Even data from personal lab machines like a genome sequencer is overwhelming - the processed output from an Illumina High Throughput Sequencer can be 100s of GB per day. And you can’t analyze it with Excel. Or rather you can, but it would make you want to chew off your fingers and use the bleeding stumps to gouge your own eyes out.

A university that wants its students to be fluent in analyzing modern amounts of data will ensure that they have access to, and learn how to use, Linux and its data utilities.
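To make the contrast with spreadsheet-style analysis concrete, here is a minimal Python sketch of the streaming approach that Linux data utilities also rely on: read one record at a time, keep only running totals, and never load the whole file. The tab-delimited layout and the column names (`read_id`, `quality`) are assumptions for illustration, not the format of any particular instrument.

```python
import csv
import io
from typing import IO


def column_mean(stream: IO[str], column: str, delimiter: str = "\t") -> float:
    """Average one numeric column while reading a single row at a time.

    Memory use stays roughly constant regardless of how large the input is,
    because only the running total and row count are kept.
    """
    total = 0.0
    count = 0
    for row in csv.DictReader(stream, delimiter=delimiter):
        total += float(row[column])
        count += 1
    if count == 0:
        raise ValueError("no data rows found")
    return total / count


if __name__ == "__main__":
    # Tiny in-memory stand-in for a file that could be many GB on disk.
    sample = io.StringIO("read_id\tquality\nr1\t30\nr2\t40\n")
    print(column_mean(sample, "quality"))
```

With a real file, the same function can be passed an open file handle, for example `open(path, newline="")`, or `sys.stdin` so that it can sit in a shell pipeline next to the standard utilities.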

There are good reasons why 92% of the world’s top supercomputers run Linux - the remaining 7% run some version of the similar BSD/Unix; only 1% runs Windows (and these are largely funded directly by Microsoft). This is the market share that matters for research.

2.5. Since we’re acting as sales agents, pay us

Why are we spending significant FTEs essentially acting as sales people for $oftware vendors when there are Open Source alternatives that are quite comparable? The $oftware vendors should actually be paying US for what we are doing - introducing new students to their products so that they’ll be uncomplaining customers for post-graduate life. We should be enlightening students as to their software options, not acting as Enablers for Microsoft. If $oftware vendors want their $oftware on our campuses, they should PAY US for the opportunity, not the reverse.

(Incidentally, OSS bypasses the whole issue of license servers, EULAs, copy-accounting, etc.)

It will not be possible to end the use of $oftware, but the default should be to consider OSS and WHY we need to pay for $oftware, rather than to thoughtlessly cut a PO for yet another round of unnecessary contractual subservience.

3. Other issues

3.1. Why are we not moving away from Windows?

Both Russia and China have released statements that they plan to remove Windows from their state computers over the next 4 years for security reasons. Also Google. And now that Windows has contributed to a major national security catastrophe for Iran, probably they will as well. UC, on the other hand, continues to pay for this notably insecure software and then pays extra for the anti-malware software that supposedly protects us from that insecurity. This is nothing but expensive comedy.

3.2. If Windows, use as much Open Source as possible

When we continue to use Windows, we should use as much Open Source and Free software as possible. There are certainly some cases where it may not be possible to rid ourselves completely of Windows, but in those cases, we should be encouraging the use of OSS on Windows. There is a significant pool of such software and since most new software for UC’s infrastructure is Web-based, the underlying platform shouldn’t matter much.

3.3. Consider replacing hardware to save energy and space.

Besides the cost savings of OSS, appropriate hardware can save considerable energy and space, obviating the need to construct much more expensive data center space. 1U (1.75"), 48-core nodes are available for ~$7K. Two such machines can replace an entire rack (44U) of 4-year-old machines, are significantly faster, and use 1/10th the energy. Such 48-core machines can probably host a hundred Windows virtual servers (but see immediately above about the use of Windows in general). A single such machine, configured appropriately, could probably supply ALL of UC Irvine’s web services, for example.
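The consolidation arithmetic above can be sketched in a few lines of Python. The rack size, node count, core count, price, and energy fraction are taken from the paragraph; the power draw of the old rack is not stated there, so it is left as an input that a campus would have to measure for itself (the 10 kW used in the demo line is purely a placeholder).

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class Consolidation:
    old_rack_units: int = 44         # from the text: one full rack of old machines
    new_nodes: int = 2               # from the text: two 1U replacement nodes
    cores_per_node: int = 48         # from the text
    price_per_node: float = 7000.0   # from the text: roughly $7K each
    energy_fraction: float = 0.10    # from the text: 1/10th the energy

    def rack_units_freed(self) -> int:
        # Each replacement node occupies 1U.
        return self.old_rack_units - self.new_nodes

    def total_cores(self) -> int:
        return self.new_nodes * self.cores_per_node

    def hardware_cost(self) -> float:
        return self.new_nodes * self.price_per_node

    def yearly_energy_saved_kwh(self, old_rack_kw: float) -> float:
        # old_rack_kw is an assumption supplied by the caller, not a
        # figure from the text.
        return old_rack_kw * (1.0 - self.energy_fraction) * HOURS_PER_YEAR


if __name__ == "__main__":
    plan = Consolidation()
    print("rack units freed:", plan.rack_units_freed())
    print("total cores:", plan.total_cores())
    print("hardware cost ($):", plan.hardware_cost())
    # Placeholder power draw; substitute a measured value.
    print("kWh saved per year at 10 kW:", plan.yearly_energy_saved_kwh(10.0))
```

Multiplying the kWh figure by the local electricity rate gives a yearly dollar saving to set against the hardware cost.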

3.4. Document, document, document

We need to document why we make decisions and how those decisions turn out. We need to make those documents public to assist other campuses in making the right decisions & avoiding expensive wrong ones. At the very least, not writing down your reasons for signing off on a multi-year, multi-hundred thousand dollar commitment is bad management, not to mention an abdication of responsibility, and a loss of paid-for institutional knowledge.

So let’s encourage people to:

Share success and failure stories. Failures are just as important as success.
Share expertise.
Don’t wait for a meeting to push out a hint on how to save money. Post it to a public list.
Crowdsource expertise, benchmarks, approaches, HOWTOs.
Write things down and then write them up (better informally than not at all).
Use a system like UCLA’s Knowledgebase or your own (OSS) wiki.

3.5. Evaluate for what you need, not what you might need in a decade.

While it’s certainly good to plan ahead, be rational about it. Encourage the use of the minimum software for a job rather than the maximum. Few people actually need to use even MS Word to transmit information. What they really need is a simple HTML or text editor. We need to be exchanging information, not proprietary software headers.

When evaluating infrastructure software, concentrate on the features we need, not on measuring the length of feature lists. More code leads to more bugs, more cost, more configuration, more to read, more to fail.

Incidentally, OSS can be fully evaluated without legal review, RFPs, cutting POs, etc. Since it’s Open Source, the full version (not a time-limited or crippled version) is available to evaluate to see if it is appropriate. At the very least, using the OSS version allows you to see what a $oftware version has to exceed to be worth paying for.

I’d be happy to suggest Open Source alternatives. Departments and Schools who don’t want to use Open Source should be welcome to continue to use $oftware and should be welcome to support it out of their own pockets. If Google, Amazon, Kmart, Target, IBM, Apple, Oracle, Cisco, and even Ernie Ball (the guitar string manufacturer) can be profitable and save money using OSS, why can’t we?

Not all Open Source is equal (or even good).

Many people (even many people in IT) have an initial idea that if software is labelled Open Source, it will be of very high quality, are dismayed when it is not, and thereafter label all OSS as junk. Just as there is a large variance in the quality of research, even in peer-reviewed journals, there can be a very large variance in the quality of OSS. I have written some methods to determine what is good quality software, and others, notably David Wheeler, have written more about this.

Open Source is not a magical panacea, but it is a tremendously valuable tool for making IT more efficient.

4. Objections to Open Source Software

I am aware that there are objections to using OSS. Some of those objections and partial rebuttals are listed below.

We don’t understand it. (Then I respectfully submit that we’re hiring the wrong people.)
It will take too much effort to support. (Ditto.)
There’s no support for it. (There often is. This is how many OSS companies make their income, among them Linux (RedHat, Ubuntu, others), Lucene (Lucid), MySQL (Oracle, others), PostgreSQL (many), Apache (many), OpenOffice (many), Trac (many), etc. However, this objection also ignores the dramatic change in IT support that search engines like Google have brought about).
Our users aren’t capable of using it. (This is simply incorrect. If they’re capable of being productive using Windows, they’re capable of being productive using Linux).
It was written by a hacker/cracker/kid who doesn’t understand enterprise computing. (This has to be examined on a case-by-case basis. It actually is often the case with new or lightly used applications. People have to be trained how to evaluate OSS.)

Comparing OSS vs Proprietary SW
Comparison of open source and closed source
Open Source-onomics: Examining some pseudo-economic arguments about Open Source
Open source: arguments for and against
A thread about this from StackOverflow

5. OSS Use Case Analysis: Military, Business, and Municipal

6. Related screeds on this topic

a Resolution for University Support of Open Software and Standards by Dr. Charlie Zender, ESS, UC Irvine.
How to evaluate Open Source Software and proprietary software for that matter.
Teach people to fix their IT problems themselves
See how easy it is to set up a thin client system
Manipulating Data on Linux
Mind your Negabits
Ten smart Linux office apps to try out
Open source is not just for Linux: 14 apps that are great for Windows users

Please feel free to disseminate this. The most recent version should be available here.