Security and the Internet

    the party line

    At first glance, security may not seem like a design-related issue, and to be sure, part of this lecture is addressed towards helping you understand the ins and outs of security. Nonetheless, there are a number of security-related issues which you should be aware of, both as an internet user, and as a designer.

    The Internet is the equivalent of a global party line: it is by definition a shared network, where the vast majority of information is freely available, and communications, including standard web requests and e-mail, are readily visible to anyone who takes the time to look. One of the most frequently used protections is the user ID and password combination. The two most common ways that security is compromised in any electronic setting are 1) users choosing bad passwords, which are easily guessed by software programs; and 2) passwords being left in easy-to-find files and locations. Choosing a good password isn't terribly difficult; it should be 6 to 10 characters in length, not be a standard dictionary word or common name, mix capital and lower case letters, have some non-letters as well, but still be memorable to you (so you don't have to write it down). Probably the best examples are personalized license plates that use creative letter and number combinations (i8NEWyrk, sK8t3rz, etc.).
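
    The password rules above can be sketched as a small checker. This is only an illustration: the function name and the tiny word list are hypothetical stand-ins (a real check would use a full dictionary and list of common names), and the rules are simply the ones listed in the paragraph above.

```python
# Hypothetical stand-in for a real dictionary / common-name list.
COMMON_WORDS = {"password", "welcome", "michael", "dragon", "letmein"}


def password_problems(password, common_words=COMMON_WORDS):
    """Return a list of the rules from the text that the password breaks."""
    problems = []
    if not 6 <= len(password) <= 10:
        problems.append("should be 6 to 10 characters long")
    if password.lower() in common_words:
        problems.append("is a dictionary word or common name")
    if not any(c.islower() for c in password):
        problems.append("has no lower case letters")
    if not any(c.isupper() for c in password):
        problems.append("has no capital letters")
    if all(c.isalpha() for c in password):
        problems.append("has no non-letters")
    return problems
```

    An empty list means the password passes every rule; the two license-plate examples above both pass, while a plain word such as "password" breaks several rules at once.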

    Security comes in many forms, but can reasonably be divided into three categories. Please note that I am not a security expert, and these definitions are intended to help you understand the terms and situations you are likely to encounter. In general terms, to encrypt something means to scramble the content in such a way as to prevent its unintended discovery. Encrypting something involves some kind of "key": a typically large, random piece of data, which is plugged into a mathematical algorithm in order to encrypt the data. The strength of the encryption (its relative inability to be deciphered) is usually measured in terms of its key length, that is, the number of bits of data in the encrypting key. When we speak of strength, we are talking about the theoretical probability that encrypted data could be decrypted without the key; in practical terms, this translates to the number of computer calculations necessary to discover the key and / or the original message directly. All forms of encryption in use today can theoretically be broken, but strongly and properly encrypted data would theoretically take (at today's processing power) longer than the expected lifespan of the universe to decrypt. Which is pretty good.

    symmetric encryption

    Oscar Mayer decoder ring

    This type of encryption uses the same key to encrypt and decrypt data, and can be exceptionally strong (at present, I am not aware of either Blowfish or Twofish being cracked). Also, symmetric algorithms tend to be extremely fast. Its primary disadvantage is that for two entities to exchange encrypted data, they must first have exchanged the key, and doing so is often neither practical nor secure. As a result, it is frequently used on the internet when a key exchange is not required, such as encrypting incoming data that needs to be securely stored, but only recovered on the storage end (financial data or account numbers, for example). In other words, it is used as a one-way filter where only a single party needs to utilize the key.
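
    The idea of one key doing both jobs can be shown with a toy example. The sketch below is not Blowfish or any algorithm named above; it is a simple XOR scramble with a random key as long as the message, chosen only because it fits in a few lines. The function name and the sample message are made up for illustration, and a key used this way must never be reused.

```python
import secrets


def xor_bytes(data, key):
    """XOR each byte of data with the matching byte of key."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"account 12345678"
key = secrets.token_bytes(len(message))  # the shared secret key

ciphertext = xor_bytes(message, key)     # encrypt
recovered = xor_bytes(ciphertext, key)   # decrypt with the SAME key
```

    Note that anyone who needs to read the message must already hold the same key, which is exactly the key exchange problem described above.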

    asymmetric encryption


    Asymmetric or public-private key encryption involves the use of two keys, one for encryption, and a different key for decryption, and is what you're using most of the time on the Internet. It is practically impossible (or at least statistically improbable) to obtain the private key from the public one. It's called public-private key precisely because one of the keys can be made public, allowing any entity to securely encrypt information for a particular recipient by using that recipient's public key. This kind of encryption is what is most frequently used to secure communication channels via the internet; for instance, SSL (Secure Sockets Layer) is widely used in web browsers today to provide a secure log-in to web sites, particularly financial web sites. SSL commonly relies on RSA to set up the secure connection; the little padlock at the bottom of your screen indicates whether SSL is currently in use. RSA and Diffie-Hellman are mathematical algorithms widely used to provide secure asymmetric encryption.
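
    To make the two-key idea concrete, here is a toy version of the RSA arithmetic using deliberately tiny prime numbers. It is a sketch for illustration only: real keys use primes hundreds of digits long plus padding schemes omitted here, and the variable and function names are my own, not part of any library.

```python
# Toy key generation with tiny primes (real keys are vastly larger).
p, q = 61, 53
n = p * q                      # 3233, shared by both keys
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (needs Python 3.8+)

public_key = (e, n)            # safe to publish
private_key = (d, n)           # kept secret by the recipient


def encrypt(message, key):
    """Encrypt a number smaller than n with the recipient's public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, key):
    """Recover the number with the matching private key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


secret = 65
ciphertext = encrypt(secret, public_key)
recovered = decrypt(ciphertext, private_key)
```

    Anyone can run encrypt with the public key, but only the holder of the private key can reverse it. With numbers this small the private key is trivial to work out; the security comes entirely from the size of the primes.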

    Web browsers today are capable of many levels of encryption; however, the two most common key lengths used are 40-bit and 128-bit (the latter is legally available only within the US). Each additional bit in the key length doubles the number of possible keys, and so the level of security; 41 bits are twice as strong as 40, and 128 bits are 2^88 times stronger than 40-bit encryption. At the current level of processing power available to a modern supercomputer, 56-bit encryption can be cracked within months or weeks. 128-bit is widely regarded as extremely secure; the military reportedly uses much longer keys, something like 2048-bit encryption. The disadvantage in using more bits is that more processing power (and thus time) is required to encrypt and decrypt data.
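
    The key length arithmetic above is easy to check directly. In this sketch the function names are made up, and the guessing rate passed to years_to_try_all is purely an assumption for illustration, not a measurement of any real machine.

```python
def keyspace(bits):
    """Number of possible keys for a key of the given length."""
    return 2 ** bits


def relative_strength(stronger_bits, weaker_bits):
    """How many times more keys the longer key length allows."""
    return keyspace(stronger_bits) // keyspace(weaker_bits)


def years_to_try_all(bits, keys_per_second):
    """Years needed to try every key at an assumed guessing rate."""
    seconds_per_year = 365 * 24 * 60 * 60
    return keyspace(bits) / keys_per_second / seconds_per_year
```

    For example, relative_strength(41, 40) is 2 and relative_strength(128, 40) is 2**88. At an assumed one billion guesses per second, a 40-bit keyspace is exhausted in well under a year, while a 128-bit keyspace would take far longer than the age of the universe.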



    hashing

    Technically, hashing is not a form of encryption, but it's similar enough and so widely used as to warrant inclusion. Hashing is used to scramble data in such a way that it can never be recovered; the significant factor is that the same input data always produces the same hash, which can serve as a sort of fingerprint. By far the most common use of a hashing function is to store user passwords. The original password can never be recovered from the stored hash; instead, when a user enters a user id / password combination, that combination is hashed, and the resulting hash (which does not reveal the password) can be transmitted to the remote server, and tested against the hash that was stored when the account was created. Quite specifically, a hashing function is used to guarantee that whoever (or whatever) is attempting to access a particular set of data is the same as whoever (or whatever) originally created the account.

    In other words, a hash is not used to transmit data, but to establish the identity of a user.
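
    A minimal sketch of password hashing, using Python's standard hashlib module, might look like the following. The function names are hypothetical, and two details go beyond what the text describes and should be read as my assumptions: a random "salt" is mixed in so that identical passwords do not produce identical hashes, and the hash is repeated many times to slow down guessing.

```python
import hashlib
import hmac
import secrets


def hash_password(password, salt=None):
    """Return (salt, digest); only these are stored, never the password."""
    if salt is None:
        salt = secrets.token_bytes(16)  # assumption: per-user random salt
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, 100_000
    )
    return salt, digest


def check_password(attempt, salt, stored_digest):
    """Hash the attempt the same way and compare it with the stored hash."""
    _, digest = hash_password(attempt, salt)
    return hmac.compare_digest(digest, stored_digest)
```

    When an account is created, hash_password is called once and its two results are stored. At log-in, check_password hashes whatever was typed and compares the result; the original password is never stored or recovered.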


    being there

    When you browse the web, you're looking at copies of files: files that are stored (in most cases) on another computer, somewhere else on the Internet, which have been downloaded to your machine, and are stored in your cache. When you email someone, you're sending a file across the internet to someone's mail server, where it's stored for their retrieval. Your machine is receiving and sending files to view and be viewed, but the software that does all the work is running on your machine.

    The Internet offers much more than this, however. It's also possible to view software on your machine that is actually running on someone else's. This mode of operation is typically called "client-server," where your machine is running a client that talks with a server on another machine. Telnet is probably the most widely known client; if you've ever used a telnet client, all the information you see inside the telnet window is being generated on a remote server. You might use telnet to configure remote software, check server statistics, or launch a server application.

    Telnet is one kind of program that provides what is more generally called a shell. It's worth noting that telnet is insecure: it sends your user id and password in an unencrypted format across the internet. If anyone is listening, they have your user id and password merely by writing them down, and any 15-year-old with a computer can do it. The truth is that most of the time, no one is listening. But if you're accessing a bank server, for instance, you can't take that risk. The best solution available today is a protocol called SSH (Secure SHell), which encrypts transmissions before sending them, using any one of a variety of encryption formats (you get to choose). And as I mentioned above, SSL is another kind of protocol, used primarily between a web browser (a client) and a server to prevent just these kinds of problems. Even these solutions can be susceptible to the "man in the middle" attack. If security is a concern or your work is genuinely at risk, I suggest reading more in depth on security.

    It is also possible to run an application in such a way that it is distributed across two or more machines. Perhaps the best example is SETI@home, a screen saver that runs on your machine and communicates with a central server running on another machine, which in turn communicates with thousands of machines around the world, all collectively running a single networked application (SETI is the Search for Extraterrestrial Intelligence, and SETI@home is considered by many to be, in effect, the world's largest supercomputer). The Java programming language is well suited to this kind of networked software. Java runs downloaded code safely by inserting a layer between itself and all the other software on your machine, a layer called a "sandbox." This layer helps make Java powerful while maintaining security.

    Generally, the Internet is far more a wonderful place than a fearful one, but every time you connect your machine to the internet, there are some risks, and it's important to be aware of them. On a project or in an office, a lack of security, or a lack of understanding about security, can produce disastrous results.


    which one is which?

    One of the things that makes information so powerful is its ability to be copied over and over again. It's virtually free to distribute as many working copies of something as you like. When authoring a document, a web site, or a software application, this same power can create enormous problems. Sometimes it's difficult to distinguish which particular copy of a file is the "original", even on your own drive, and the problem becomes exponentially more complex when many people are involved in the creation of a document or file.

    The golden rule is always keep a master copy: one original from which all other copies are derived. It is a common mistake to overwrite a more recent copy of a file with an older copy, or worse, to have two users make different and conflicting changes, and have no easy way to re-create a single master copy. One powerful but readily available concept in managing files and content is check-in check-out. As the name suggests, individuals can "check out" a copy of a particular file in order to work on it, much like checking out a book from the library. For the duration that a file is checked out, no other user can check out or change this file. Once work is finished, a file is checked back in, and the next person or team can check it out. Many development tools support this working mechanism, including Dreamweaver and FrontPage for the web (primarily used for HTML), and virtually every major software development tool.
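
    The check-in check-out idea can be sketched in a few lines. This is not how Dreamweaver, FrontPage, or any particular tool implements it; the class and method names are hypothetical, and the sketch only tracks who holds each file.

```python
class FileLibrary:
    """Tracks which user, if any, has each file checked out."""

    def __init__(self):
        self.checked_out = {}  # file name -> user holding it

    def check_out(self, filename, user):
        """Give the file to one user; refuse if someone already has it."""
        holder = self.checked_out.get(filename)
        if holder is not None:
            raise PermissionError(f"{filename} is checked out by {holder}")
        self.checked_out[filename] = user

    def check_in(self, filename, user):
        """Return the file; only the user who holds it may do so."""
        if self.checked_out.get(filename) != user:
            raise PermissionError(f"{user} does not have {filename} checked out")
        del self.checked_out[filename]
```

    A second user who tries to check out a file that is already held gets an error instead of a copy, which is what prevents two people from making conflicting changes to the master.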

    Another less well-known but perhaps more powerful tool is CVS (Concurrent Versions System). Primarily used by software developers, particularly in the open source community, a CVS server allows many people to check out the same document, and work simultaneously on the same file. Upon check-in, a file is reviewed, and all concurrent changes are integrated into a single document. Conflicts are identified, and the user is offered an opportunity to resolve the conflict. CVS also allows complex software packages to be versioned by assigning major and minor numbers at agreed-upon milestones (such as "everything works today!"), and allows software to be "rolled back" to a previous version (in the event development takes an unexpected turn). Although CVS has been primarily used for coding, it is a powerful tool, and could potentially work effectively on any text-based design project (creating a corporate brochure for instance).
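
    The way concurrent changes are integrated and conflicts flagged can be illustrated with a very simplified merge. This is not CVS's actual algorithm: the function name is made up, and the sketch assumes that all three versions of the file have the same number of lines, so edits may change lines but never add or remove them.

```python
def merge_lines(base, mine, theirs):
    """Merge two edited copies of base, line by line.

    Simplifying assumption: all three versions have the same number
    of lines. Returns (merged_lines, conflicting_line_numbers).
    """
    if not len(base) == len(mine) == len(theirs):
        raise ValueError("this sketch only handles equal-length versions")
    merged, conflicts = [], []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:            # both copies agree on this line
            merged.append(m)
        elif m == b:          # only the other person changed it
            merged.append(t)
        elif t == b:          # only I changed it
            merged.append(m)
        else:                 # both changed it differently: a conflict
            merged.append(b)  # keep the original until someone resolves it
            conflicts.append(number)
    return merged, conflicts
```

    Changes to different lines are combined automatically; only a line that two people changed in different ways is reported back for a person to resolve.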



    From a design perspective, security affects the architecture of almost every aspect of a network application. The impact can be fairly simple. For instance, providing log-in access to a user affects three things: 1) the web site, which must have a user-specific area; 2) the software, which must verify that a user is who they say they are; and 3) the database, which must securely store and retrieve user ids and passwords (or their hash).

    As companies grow to understand and fully utilize the internet, more and more corporate business is conducted on and through network applications, all of which are typically interlinked. In addition to having an account, users must also belong to a permissions group, which defines what areas they can access and, if relevant, at what level of clearance.

    As illustrated at left, web users might have access to a large body of public data. Web developers have access to a development server, and may have the ability to affect some of what a user sees as they modify and upgrade software. However, they may not be able to modify public data (only content managers or site administrators might have that privilege). Database administrators share some access with developers, but none with the public web server. The LAN would allow some access to internal machines, but possibly not to the internet.
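
    A permissions scheme like the one just described can be sketched as a simple lookup table. The group names, area names, and access levels below are hypothetical, chosen only to loosely follow the roles mentioned in the text; a real site would define its own.

```python
# Hypothetical groups and areas, loosely following the roles in the text.
PERMISSIONS = {
    "web_user":        {"public_data": "read"},
    "developer":       {"public_data": "read", "dev_server": "write"},
    "content_manager": {"public_data": "write"},
    "db_admin":        {"dev_server": "read", "database": "write"},
}

LEVELS = {"read": 1, "write": 2}  # higher number = more clearance


def can_access(groups, area, level="read"):
    """True if any of the user's groups grants at least this level."""
    for group in groups:
        granted = PERMISSIONS.get(group, {}).get(area)
        if granted is not None and LEVELS[granted] >= LEVELS[level]:
            return True
    return False
```

    With this table, a developer can read public data but not modify it, a content manager can modify it, and a database administrator has no access to it at all.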

    Although designing permissions groups is an art in itself, the information designer must be concerned with how to structure available information so that a particular user can only "see" a permissible group of files and folders. The Interface Designer must focus on how to present certain kinds of information to particular groups of users. There might be any number of web interfaces on a corporate site: one for the public, one for the client, one for the company at large, one for technical people and one for management. Of course, the larger the company, the more complex the design task.


    quis custodiet ipsos custodes?

    Sending and receiving data, accessing remote machines, designing complex group sites: how do you know if what you're doing is truly secure, or secure at all? How secure is secure, anyway? Who is "listening" out there, and should you even be worried? If you're lucky, a security breach is a nuisance, such as someone defacing a web site or reading an email. The worst case scenario is that an entire computer or even a network is compromised, and everything is lost (which is why you should always have a backup of anything important) or stolen (which has happened with credit card databases more than once). But the first line of defense is you: using some basic safety procedures, and consulting a security expert when it's relevant, are the two best things you can do.

    Probably the bottom line is this: if it needs to be secure, encrypt it. If you're uncertain about the security, check first. Never email anything you can't afford to give away (don't ever email user IDs and passwords). You can encrypt browser sessions (SSL), email (via PGP software), shell sessions (via SSH), and data (using symmetric or public-private key technology, with a hash for stored passwords). As the saying goes, better safe than sorry.

    And as to who's listening, well, you'd be surprised. But that's another course altogether.




all content © copyright 2003 neil verplank, unless otherwise stated