Welcome to the kriha.org weblog

What's New

Blind signatures - a way to solve e-voting problems?

Here a discussion of several protocols for secure voting is needed. Please stay tuned.

Trust - the right attitude towards voting terminals and central servers?

It has been said many times: at the core of every security problem lies a trust relation. The way the Diebold system under discussion has been designed requires a lot of trust to be put into client- and server-side processing of votes. The system itself is a black box as far as voters are concerned: they have no chance of verifying that their votes have been counted, and of course no way to determine that their votes have been PROPERLY counted. Why the distinction? The Library of Congress report mentioned below talks, at the end of its discussion of e-voting alternatives, about new crypto-based voting mechanisms which could provide verification to different degrees. What kind of verification would we like, and how could it make trust relations unnecessary? (It is important to note that the current paper-based system also includes a number of trust relations but tries to control those relations through mutual checks.)

A voter should be able to verify that her vote was properly counted.
The local election place might want to verify that its results were properly aggregated at election central.
Election central needs to verify that results from local places are properly aggregated.
Election central needs to verify that votes are not changed and that no artificial votes are added to the results.

Are the latter requirements necessary if we can guarantee full end-to-end verification through voters? Yes, they are, because a voter can only verify her OWN vote and nobody else's.

The first requirement is a hard one: we are talking about end-to-end verification across distributed systems with the additional constraint that anonymity must be guaranteed. Eric Fisher mentions the following idea using cryptography: A voter sees her ballot with her votes filled in on a screen, plus an encrypted result string. This string is printed and handed over to the voter. A day later newspapers (or more likely: web sites) publish lists of those encrypted strings and the voter can check that her string is listed. Fisher points out that neither the publishers nor the voter herself nor any reader knows HOW she voted, which prevents the buying of votes. And the fact that the keys for the encrypted votes are generated by election central does not necessarily compromise the voter's anonymity, because the client-facing systems must then guarantee that the voter's identity is unknown to the place where the encryption is done. But that last requirement seems to be a solvable problem.
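
As a rough sketch of this receipt idea (not Fisher's or any vendor's actual design - the election key handling and ballot format are invented for illustration), the terminal would encrypt the ballot, print the resulting string as the receipt, and the voter would later check whether exactly this string shows up in the published list:

    import javax.crypto.Cipher;
    import javax.crypto.KeyGenerator;
    import javax.crypto.SecretKey;
    import java.util.Base64;
    import java.util.List;

    // Sketch only: encrypt the ballot, print the ciphertext as the receipt,
    // later check the published list for that exact string. A real scheme
    // would need randomized encryption so that identical votes do not
    // produce identical receipts.
    public class ReceiptSketch {
        public static void main(String[] args) throws Exception {
            SecretKey electionKey = KeyGenerator.getInstance("AES").generateKey(); // held by election central

            String ballot = "party=X;candidate=42";
            Cipher c = Cipher.getInstance("AES");
            c.init(Cipher.ENCRYPT_MODE, electionKey);
            String receipt = Base64.getEncoder().encodeToString(c.doFinal(ballot.getBytes("UTF-8")));
            System.out.println("printed receipt: " + receipt);

            // The next day the voter scans the published list for her string.
            List<String> published = List.of(receipt /* ...all other receipts... */);
            System.out.println("my vote is listed: " + published.contains(receipt));
        }
    }

Note that this sketch only shows the checking step - it says nothing about how the voter could know that the encrypted string really contains her vote, which is exactly the problem discussed next.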

This mechanism of course also prevents a real verification by the voter. I currently do not know a workaround for this problem, because once a voter can verify the content of her vote, a buyer of votes can do so as well. On the other hand the mechanism places a whole lot of trust in the election processing instances. Right now those instances are based on mutual control of paper ballots, aggregated results etc. With a complete Diebold system in place many of those controls are performed only indirectly, through software running inside black boxes.

And more: how would a voter check that the printed encrypted version of her vote REALLY matches her on-screen vote? There is still a way for the machine to cheat the voter. So we still have not managed to eliminate the trust that has to be put into the system.

The trust relationships between local and central election places could be reduced by using signatures on both sides. Local places could then verify that their results had been counted properly, and they could also check the results from other places. This could also prevent later additions to the results, because those would require new signatures.
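
A minimal sketch of this signing idea (key handling and result format are invented; Java's standard Signature API is used only for illustration):

    import java.security.KeyPair;
    import java.security.KeyPairGenerator;
    import java.security.Signature;

    // Sketch: a local election place signs its result before reporting it.
    // Election central - or any other local place holding the public key -
    // can verify the signature, so later modifications or additions would
    // require forging a new signature.
    public class SignedResultSketch {
        public static void main(String[] args) throws Exception {
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
            kpg.initialize(2048);
            KeyPair localPlace = kpg.generateKeyPair();

            byte[] result = "district=17;partyA=1234;partyB=987".getBytes("UTF-8");

            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(localPlace.getPrivate());
            signer.update(result);
            byte[] signature = signer.sign();

            // At election central (or at another local place):
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(localPlace.getPublic());
            verifier.update(result);
            System.out.println("result authentic: " + verifier.verify(signature));
        }
    }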

All in all, the trust needed in the Diebold system is too high. The proposed solutions based on cryptography point in the right direction. Interestingly, they no longer try to simulate the current paper-based voting system (as the Diebold terminal and most other e-voting solutions do) but try to renovate the whole process.

Voter verifiable audit trail - why do e-voting companies and officials fight this idea so much?

During our discussion of e-voting we have benefited repeatedly from comparing the problem at hand with similar problems in other areas. I would like to use the same approach here by comparing the voting process with shopping in a supermarket. Shopping nowadays involves quite a lot of computing machinery. But one thing has prevailed until now: once you are done with shopping you must pay, and as a result you receive a receipt which lists your goods as well as what you have been charged. And sometimes you might notice something wrong with the receipt - and chances are that the mistake will be corrected.

I have yet to find somebody who would claim that the software in the point-of-sale registers and the personnel driving them are so secure that a paper receipt is just an unnecessary waste of resources. But here we are of course talking about important things like salad and fries, apples and vinegar and so on.

So why do companies and officials fight the idea of a paper receipt for a vote so much? Before I delve into this question we need to analyse the role the voter verifiable audit trail would play in an election. This role has two parts. One part allows a voter to verify her vote immediately and without additional technical interfaces. This is equivalent to the paper ballot before it is thrown into the ballot box. We have seen that this immediacy is not present in electronic voting devices because the representation of the voter's will could be manipulated in various ways. But the second part of the role is even more important: the voting receipt allows a crosscheck of the voting equipment. The receipts are collected and at the end of the day - perhaps as a result of a random drawing - certain election places are required to do a manual count of the votes, and the results are compared to those of the e-voting equipment. In the regular paper voting process this is done in every case, but in an e-voting system it is done only as an additional verification of the e-voting process and equipment.
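
The crosscheck part could be as simple as the following sketch (precinct names and counts are of course made up): draw some election places at random, count their paper receipts by hand and compare the totals with what the machines reported.

    import java.security.SecureRandom;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Map;

    // Sketch of the random audit: pick some precincts at random and compare
    // the manual count of the paper receipts with the machine totals.
    public class RandomAuditSketch {
        public static void main(String[] args) {
            Map<String, Integer> machineTotals = Map.of("P1", 812, "P2", 655, "P3", 904);
            Map<String, Integer> manualCounts  = Map.of("P1", 812, "P2", 655, "P3", 901);

            List<String> precincts = new ArrayList<>(machineTotals.keySet());
            Collections.shuffle(precincts, new SecureRandom());
            for (String p : precincts.subList(0, 2)) {          // audit two precincts
                boolean ok = machineTotals.get(p).equals(manualCounts.get(p));
                System.out.println(p + (ok ? " matches" : " DISCREPANCY - full recount needed"));
            }
        }
    }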

This does not sound too complicated or artificial - so why does it meet with so much resistance? Let's have a look at the arguments against the voter verifiable audit trail (which by now must be recognized as a partial misnomer, because voter verification is only one part of its role). The main argument is that the introduction of printers leads to a fragile voting process due to printer failures. Others cite high additional costs as a reason for declining a paper audit trail.

Are those arguments really convincing in the light of the supermarket example above? It does not look like supermarkets have a hard time printing paper receipts, and I also haven't heard about bankruptcy cases due to the high costs of supermarket printers. Quite the contrary: those printers are cheap, extremely reliable and people already know the format. (Why does the supermarket example suddenly seem to be an even more perfect match to the voting process? (;-))

Anyway, the result seems clear to me: there are no real reasons NOT to hand out a paper receipt in e-voting. One of the reasons for the resistance against it might be that it allows the computerized voting equipment to be checked for correctness. And it looks like this is already too much of an attack against e-voting systems and software. Are they only considered safe as long as they are unchallenged? This is close to the way the international ice skating organization went with its voting process: make it completely anonymous so that even blunt manipulations will go through unquestioned. (Again an example taken from Bruce Schneier's Cryptogram.) If companies selling e-voting equipment really trusted their systems they would not object to that additional verification step. Resistance against it looks VERY suspicious.

Ballot design for e-voting

The design of online ballots is quite difficult. It is well known that the placement of parties or people on ballots has a major impact on the votes they receive. Many countries solve this problem with the rule that the position is determined by the number of votes received in the last election. But is this rule still fair if the screen does not show the lower part of the list AT ALL because one has to scroll down to see it? It is also well known from web-usability studies that many users don't even recognize the scroll bar, and good web design avoids scrolled pages if possible. But I have seen ballots like the one used for Freiburg's community elections, which is printed as a very long piece of paper. In the case of e-voting I presume Freiburg would have to get rid of some parties (;-). But that is not all: the Freiburg elections also feature an extremely complicated set of rules for voting. One can mix and match almost anything (I know - I was involved in getting the results through an OCR-based system in 1994) and frankly I don't know how this could be represented in an easy enough way in a browser window without causing a lot of problems. The dynamic HTML needed here would probably break every browser - and remember - we don't want to install software. Again, an applet could help here.

But the first problem is of course the representation of the parties and persons in a way that does not exclude computer illiterates from voting. The next problem is, e.g., how to transport the confirmation back to the user.

Do people cheat with online train tickets?

Conductors used to ask for your "Bahncard" only once per trip in each direction (sometimes only once at all), and they used a special tool which left a hole in your online ticket indicating that the carrier's Bahncard had already been checked. This was convenient for travellers because they usually sit on their Bahncard (;-)

But the Deutsche Bahn seems to have changed this policy, and now every time a conductor asks for your online ticket your Bahncard is checked as well. So I asked a lady who asked for my Bahncard (after seeing that it already had THREE holes) whether the Bahn had noticed cheating with online tickets.

If you are interested in how the system might work, take a look at my lecture on internet security. There is one session where I tried to model bahn.de with my students. As the system works right now it is true that only the online check establishes proof of purchase. The marks left by conductors are not safe enough and could be forged. And they are not machine readable.

How could the Bahn improve customer convenience and still arrive at a safe system? They could e.g. use barcodes printed on the online ticket and scan those codes for the online check. But this alone would not allow customer identification, and the ticket is usually bound to a Bahncard.
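
One speculative way (I do not know how the Bahn actually does or would do this; all field names are invented) to make a barcode check both machine readable and bound to a Bahncard would be to compute a cryptographic checksum over the ticket data plus the Bahncard number and put it into the barcode:

    import javax.crypto.Mac;
    import javax.crypto.spec.SecretKeySpec;
    import java.util.Base64;

    // Speculative sketch: the barcode carries the ticket data plus a MAC
    // computed with a key known only to the Bahn. The conductor's scanner
    // recomputes the MAC, so the ticket is machine checkable and bound to
    // one specific Bahncard number.
    public class TicketBarcodeSketch {
        public static void main(String[] args) throws Exception {
            byte[] bahnSecret = "demo-key-not-real".getBytes("UTF-8");   // held by the Bahn only
            String ticket = "ticket=4711;trip=S-HH;bahncard=1234567890";

            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(bahnSecret, "HmacSHA256"));
            String tag = Base64.getEncoder().encodeToString(mac.doFinal(ticket.getBytes("UTF-8")));

            System.out.println(ticket + ";mac=" + tag);   // content of the printed barcode
        }
    }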

When I did the study with my students I noticed the emphasis the Bahn AG put behind its Bahncard system, and at that time I thought that they used it as a single point of identification - assuming that the Bahn AG, like every other corporation, has an established chaos of different databases with no single way to identify a customer. This may still be one of the reasons.

But the increased frequency of checks allows another interpretation as well: Bahn AG discovered the online identity checks as a perfect means to establish personalized travel statistics - usable on an individual or aggregate level e.g. to improve service (or raise prices).

To me it shows that the slowing down of the economy did not slow down the hunger of corporations for private customer data. And the Bahn AG is not the only one.

Today I changed 150 Swiss francs into euros. Guess what: the lady asked for my passport. Is this to protect against counterfeit money? Is there really a reason to ask for my identity credentials simply to change money?

It could be professional paranoia slowly getting the better of me but still, why is everybody after my private data?

E-voting alternatives: which system could work? (Part three of a study on e-voting systems)

This part will take a look at several possible architectures for e-voting systems. We will start with voting@home - a system that uses home equipment to perform electronic voting. Another alternative is to install e-voting equipment at official voting places (just like voting is currently done, except for the "e" part of it). Public kiosk systems installed at supermarkets could be another choice. And finally, optical systems which do not change the current paper-based voting process but speed up the results by scanning and OCRing the ballots afterwards need to be mentioned.

For every system we will have to compare its security qualities to those of regular paper-based voting. We will get help in our analysis by also comparing the systems with other practices like absentee (postal) voting, e-banking and the use of special hardware-based systems (smart cards etc.).

The first architecture tries to maximize voter convenience by using equipment from your home: a PC running either Windows or Linux (we need to include some non-Windows equipment because otherwise we might run into political problems). This leaves quite a number of software options: Should we require a fat client for voting or will a browser do? How will the voter express her will? How do we capture this will and how is it transmitted? But first: how do we allow a home user to cast a vote?

Authentication and authorization of home voters

The rules for voting require us to ensure the fairness and openness of the voting process: only entitled voters must be able to vote, and they must not be able to vote several times (at least not leave multiple counted votes). The first idea was to use a mechanism similar to the one in place today: registered voters receive a letter which contains a card (the voting card, which has to be taken to the voting place). In our vote@home case the letter would only contain a token. This token has been generated at a token factory. The token was put into the letter without establishing any connection between the token and the letter's receiver (blind packaging). At home, the voter will use the browser GUI to enter the token and place a vote (we will discuss this further down).

This mechanism shows a number of security problems. The only good quality is that it absolutely ensures anonymity - if the token packaging process ensures anonymity as well. On the downside we lose every chance to revoke a token, e.g. if a voter complains that no letter was received. We need to send a new token but we cannot invalidate the existing (presumably lost) token - which could be used to cast several votes by the same person, thus breaking our election principles. Another problem is that ANYBODY getting hold of such a token (e.g. taking it out of your mailbox) could use it to vote.

Obviously this process needs improvements. The first proposal suggested the use of an additional piece of identity, e.g. a passport number. But those numbers are not exactly a secret either and could be known by several people. Another suggestion was to split the voting credentials into two parts - following the scheme used when EC bank cards are sent: first comes the card and then comes the PIN. Both cases would also increase our anonymity problem and would require additional measures at election central. But worse: neither proposal helps against attempted election fraud through several votes placed by the same voter.

It looks like we have to authenticate the voter when a vote is placed to prevent double votes (in this case we do not need to know the token-voter association) - or we need to store the association between voter and token at election central so that we can invalidate a token if it is claimed to be lost. In both cases we realize that anonymity must now be guaranteed at election central - a major shift from the paper voting process, which performs authentication locally and keeps it separate from the voting process itself.

But my colleague Roland Schmitz reminded me that pretty much the same problem is posed by digital money and that the technique of "blind signatures" might be able to help us out.
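
Here is a textbook sketch of Chaum's RSA blind signature idea (raw BigInteger arithmetic, no padding or hashing, so nothing you would deploy): election central signs a blinded token after authenticating the voter, never sees the token itself, and the voter later presents the unblinded token and signature anonymously.

    import java.math.BigInteger;
    import java.security.KeyPair;
    import java.security.KeyPairGenerator;
    import java.security.SecureRandom;
    import java.security.interfaces.RSAPrivateKey;
    import java.security.interfaces.RSAPublicKey;

    // Textbook RSA blind signature: the signer signs m without learning m.
    public class BlindSignatureSketch {
        public static void main(String[] args) throws Exception {
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
            kpg.initialize(2048);
            KeyPair kp = kpg.generateKeyPair();
            BigInteger n = ((RSAPublicKey) kp.getPublic()).getModulus();
            BigInteger e = ((RSAPublicKey) kp.getPublic()).getPublicExponent();
            BigInteger d = ((RSAPrivateKey) kp.getPrivate()).getPrivateExponent();

            // Voter: choose a token m and a random blinding factor r.
            BigInteger m = new BigInteger("123456789");
            SecureRandom rnd = new SecureRandom();
            BigInteger r;
            do { r = new BigInteger(n.bitLength() - 1, rnd); } while (!r.gcd(n).equals(BigInteger.ONE));
            BigInteger blinded = m.multiply(r.modPow(e, n)).mod(n);   // m * r^e mod n

            // Election central: signs the blinded value after authenticating
            // the voter, but never learns m.
            BigInteger blindSig = blinded.modPow(d, n);               // = m^d * r mod n

            // Voter: unblind to obtain an ordinary signature on m.
            BigInteger sig = blindSig.multiply(r.modInverse(n)).mod(n);

            // Anyone can verify the token later, without knowing who received it.
            System.out.println("valid token signature: " + sig.modPow(e, n).equals(m));
        }
    }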

Capturing the voter's will. Let's assume the voter knows what she will vote. How do we capture this decision? This involves the GUI (ballot design) as well as the process of voter input, and last but not least how the vote is transmitted to election central. Let's not focus on e-ballot design yet - that is worth an extra chapter. Let's just focus on vote input and transmission. Input of a vote seems straightforward enough until you start questioning some assumptions: do screen content and stored vote really match? What if a trojan horse displays the proper ballot but sends something different? Does the voter have a means to control what has been sent to election central? After pondering this problem for a while we came to the conclusion that there is no technical guarantee for the voter that her vote really made it to election central. The trojan could always intercept vote and response.

At this point it became clear why Bruce Schneier carefully distinguishes between a regular signature on paper and a digital signature. The difference is anything but unimportant. The paper signature is proof that the signer has seen (immediately) what she signed. (Yes, there are tricks against this, like covering the paper to be signed with a different paper and leaving only the place for the signature uncovered.) But if the signer can raise reasonable doubt that she has really seen the paper, the signature is usually called in error and does not count. A digital signature is only proof that a certain private key has been used. Nothing more. And there is a long list of ways the key could have been used wrongly: the key could have been stolen, or the content signed was not the one shown on the display to the signer. One technical way to ensure that the content signed was the content seen is to use a tamper-proof card reader with display and a tamper-proof smart card containing the secret key and encryption algorithms.

So the browser-based solution is vulnerable to trojan horse attacks (and possibly to client-oriented DOS attacks as well - even though a DOS attack might be much more effective if run against the server side of this client/server type solution). Extending the PC with a card reader and smart card is a major change in the architecture and price of the solution. But it would allow the use of client certificates as well - thereby solving the authentication problem.

In which way would certificates on the client side help our solution to be more secure? We have seen that, short of a card reader, we can't solve the problem of capturing the voter's will in case of a trojan. Certificates installed on the browser side would not improve that situation, but they might be able to help with a different problem: vote transmission from client to server. Without a client certificate the server cannot authenticate the client, and SSL is in this case susceptible to certain man-in-the-middle attacks. But let's first take a look at the server side. The server certificate is either a public certificate (Verisign etc.) or it is a self-signed certificate. If the certificate is self-signed the browser needs the root certificate of the server installed. Otherwise voters would get the message that the browser cannot verify the server certificate because the certificate authority which signed the server certificate is unknown. (Remember: the root certificates of public CAs - which contain the public key of the certificate authority - come installed with the browser, because the safe transport of those root certificates is of utmost importance for the whole verification chain.)
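
If a dedicated client program (instead of a stock browser) were used, the election root certificate could be shipped with it and be the only certificate trusted. A sketch of that idea in Java (file name and alias are placeholders, not part of any real system):

    import javax.net.ssl.SSLContext;
    import javax.net.ssl.TrustManagerFactory;
    import java.io.FileInputStream;
    import java.security.KeyStore;
    import java.security.cert.CertificateFactory;
    import java.security.cert.X509Certificate;

    // Sketch: build an SSLContext that trusts only the election CA's root
    // certificate instead of the browser's built-in list of public CAs.
    public class ElectionTrustStoreSketch {
        public static SSLContext contextTrustingElectionCA() throws Exception {
            CertificateFactory cf = CertificateFactory.getInstance("X.509");
            X509Certificate electionRoot;
            try (FileInputStream in = new FileInputStream("election-root.pem")) {
                electionRoot = (X509Certificate) cf.generateCertificate(in);
            }

            KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType());
            trustStore.load(null, null);                  // start with an empty trust store
            trustStore.setCertificateEntry("election-ca", electionRoot);

            TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            tmf.init(trustStore);

            SSLContext ctx = SSLContext.getInstance("TLS");
            ctx.init(null, tmf.getTrustManagers(), null); // no client certificates in this sketch
            return ctx;
        }
    }

Of course this brings back the software installation problem discussed below.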

In any case the server must set the proper cipher specs: it must be configured to accept only strong encryption suites and reject all others. Remember: without this configuration weak or even NULL cipher suites could be negotiated for the SSL connection, and there would be little or no confidentiality or integrity.
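
In Java, for instance, such a restriction could look like this sketch (the suite names are just examples; a real deployment would take them from current recommendations):

    import javax.net.ssl.SSLServerSocket;
    import javax.net.ssl.SSLServerSocketFactory;

    // Sketch: enable only a few strong cipher suites so that weak or NULL
    // suites can never be negotiated by a client.
    public class CipherConfigSketch {
        public static void main(String[] args) throws Exception {
            SSLServerSocketFactory f = (SSLServerSocketFactory) SSLServerSocketFactory.getDefault();
            try (SSLServerSocket server = (SSLServerSocket) f.createServerSocket(8443)) {
                server.setEnabledCipherSuites(new String[] {
                    "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
                    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"
                });
                // server.accept() ... handle voting connections
            }
        }
    }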

In the case of a self-signed server certificate we could send the corresponding root certificate (together with the client certificate) to the voter in advance, together with the token. But NO, we don't need a token anymore, do we? If we require a client certificate for our connection to the server we immediately know the client identity on the server side and can get rid of the token. But wait, this scheme has several weak points. Sending a certificate to the client would require either that the client sends her public key to us in advance (so we can sign it with our root certificate) or that we generate the private key for the client and attach it to the letter. The latter makes the whole public/private key security go down the drain, because now we know the client's private key and the private key is reduced to a password-like device.

Installation of client side certificates would pose an interesting problem as well: Where do you store the different certificates if there is only one PC in the household but several voters? Do you require a certificate per voter? Or do you only want to secure the communication channel better by using one certificate per machine? One certificate per voter needs all these certificates installed in different browser profiles. Now you can explain the browser profile mechanism and installation process to your voters - not a very pleasing thought. And server generated client certificates do not really improve communication security much.

Server-side authentication has another problem hidden in it: fake internet hosts which imitate the election central site. Let's assume the letter to the voters also contains a URL of the type "http://www.bundestagswahl04.de" and voters need to point the browser to this URL. You could supply a piece of software which does exactly that - but then you run into software installation problems as well: what happens if some letters are faked and voters receive trojans or viruses instead? Don't forget: everybody knows that those letters will be sent to voters. But if users need to point the browser to the voting site by themselves there is always a danger of them mistyping the URL (or getting wrong URL information in the first place). Most users do NOT check the server certificate for correctness of the host name and could be tricked into voting on a faked site.

A first conclusion on our browser-based voting idea could be this: we don't want to install anything, but in this case we cannot prove that the voter's will has really been captured and sent to us. The whole process could be much easier if we could identify voter and token in advance. Then we could work the way many e-banking solutions work today. The idea of a card reader etc. is tempting but expensive. A signed applet could be a good compromise. But I'd like to discuss software installation problems in the context of the kiosk or election place architectures below. Here they are even more critical.

And we still need to discuss GUI design (ballot design) before we move on to other architectures for e-voting.

Example 1. Vote@home using a PC with browser

Here goes our example with some diagrams


How do you cheat in a paper voting system? (Part two of a study on e-voting systems)

Before we dive into how an e-voting system could work, it pays to take a look at how we perform a vote nowadays. The goal of this part is to establish voting principles for all voting systems and create a policy to measure e-voting systems against. I took much of the following from an article on the Australian e-voting system and another article from Australia which questions the need and feasibility of e-voting today. (TODO: get the urls here). And I've discussed the issue with my students, establishing a threat model and an attack tree in the process.

Let me give you the results right away: the current paper-based voting system has a lot of clever principles which make it hard to manipulate votes. But mostly it is the mutual control by members of different parties which does the trick. And the absentee (postal) vote is a much weaker form of voting with respect to the guiding principles and could form a model for e-voting. The core principles of our voting system are as follows:

Separation of authentication and voting with a crosscheck of votes against voters using the election cards. There cannot be more votes than election cards collected. The means of authentication vary (passport, personal acquaintance etc.)
Paper ballots are an immediate feedback to the voter with no other technical means between the voter and her expressed will.
A token (the voting card) prevents repeated voting by one person
Voters receive no receipt to avoid fraud (e.g. buying of votes). But they have received immediate feedback through their paper ballot.
A chain of independent observers with different interests (party memberships) guarantees that votes are correctly counted and reported to the central election offices.
In case of problems votes can be counted again locally and centrally.
Availability is guaranteed through several voting places per town and free entrance is guaranteed by the state.

The downside of this organization is its considerable cost, which makes frequent voting, e.g. more direct voter decisions, an expensive procedure. The time needed to calculate the final result is tolerable.

Expressed as policies, such a voting system guarantees free, independent, fair, secure and traceable elections.

How can we improve the analysis of the paper-based voting process? There are two security methods which can help here. The first is the creation of a threat model with an associated risk calculation and ordering. The second is the creation of an attack tree which captures security threats over the whole lifetime of a product or process.

A threat model can be a simple table which captures possible attackers, the attack, the difficulty they face, the risk they have to take, their gain (or motivation), the consequences for the system under investigation and perhaps an estimate of the frequency of attacks. From these parameters one can make a rough risk assessment per attack. Do not expect a formula here, because the variables involved are mostly qualitative. But at least one can create a list of possible attacks ordered by risk. The order comes from applying simple rules like: the higher the gain and consequences, and the lower the attacker's risk and difficulty, the higher the overall risk. An advantage of such a simple list is that one can focus initially on the prevention of those attacks that carry the highest risk.

Table 1. A threat model

Nr.: 1
Attack: Break anonymity
Attacker: Local election helper
Difficulty: High (e.g. ordering the ballot papers in the ballot box and matching them against the order in which voters voted)
Gain: Low (small, because the data is hard to sell)
Risk (attacker): High (high personal risk)
Consequences (system): Medium (votes not invalid, but the voters' privacy is violated)
Frequency expected: Low
Risk level: low to medium

Rows would be created for all kinds of possible attacks like marked paper, replaced ballots, cheating with results etc. After this exercise we have a much better understanding of the election process and its risks.
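
The rough ordering rule can be expressed in a few lines of code; this sketch (the attacks and ratings are only illustrative) turns each row of the table into a record and sorts by a crude qualitative score:

    import java.util.Comparator;
    import java.util.List;

    // Sketch: a threat-model row plus a simple rule of thumb - the higher
    // gain and consequences and the lower difficulty and attacker risk,
    // the higher the overall risk.
    public class ThreatModelSketch {
        enum Level { LOW, MEDIUM, HIGH }

        record Attack(String name, Level difficulty, Level gain,
                      Level attackerRisk, Level consequences) {
            int score() {
                return gain.ordinal() + consequences.ordinal()
                     + (2 - difficulty.ordinal()) + (2 - attackerRisk.ordinal());
            }
        }

        public static void main(String[] args) {
            List<Attack> attacks = List.of(
                new Attack("break anonymity",     Level.HIGH,   Level.LOW,  Level.HIGH,   Level.MEDIUM),
                new Attack("replace ballots",     Level.MEDIUM, Level.HIGH, Level.HIGH,   Level.HIGH),
                new Attack("marked ballot paper", Level.MEDIUM, Level.LOW,  Level.MEDIUM, Level.MEDIUM));

            attacks.stream()
                   .sorted(Comparator.comparingInt(Attack::score).reversed())
                   .forEach(a -> System.out.println(a.name() + " -> score " + a.score()));
        }
    }

Do not read too much into the numbers - the point is only to get a defensible ordering, not a precise measurement.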

The attack tree lets you create a tree-like structure with your product or process at the top. The first level below the top starts breaking the product or process down into different parts representing different steps in the lifecycle. A product e.g. might start with ICs being produced in a chip factory. A product gets assembled from those parts in another factory. The product is sold through a channel. Later it is installed and configured. It is administered and used. And finally - don't forget this step - it should be destroyed without leaking critical information. At any point in this lifecycle attackers can perform an attack. Those attacks and who might perform them are written down on the lower levels of the tree, thereby refining our knowledge about the product/process and possible attacks. In my opinion the most valuable feature of an attack tree is that it forces us to watch the whole lifecycle of a product or process. In the case of elections we realized e.g. that the paper used for ballots could be secretly marked (invisibly) to break the anonymity of voters. The threat model, though, can put this threat into a realistic perspective.
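
A tree like that is trivially represented in code; the following sketch (the lifecycle steps and attacks are only examples) builds a small attack tree for the paper election and prints it:

    import java.util.ArrayList;
    import java.util.List;

    // Sketch: an attack tree as a recursive structure. The first level holds
    // the lifecycle steps, lower levels refine them into concrete attacks.
    public class AttackTreeSketch {
        static class Node {
            final String label;
            final List<Node> children = new ArrayList<>();
            Node(String label) { this.label = label; }
            Node add(String childLabel) { Node c = new Node(childLabel); children.add(c); return c; }
            void print(String indent) {
                System.out.println(indent + label);
                for (Node c : children) c.print(indent + "  ");
            }
        }

        public static void main(String[] args) {
            Node root = new Node("paper election");
            root.add("ballot paper production").add("secretly marked paper (breaks anonymity)");
            root.add("local counting").add("miscounted or replaced ballots");
            root.add("transport of results").add("manipulated report to election central");
            root.print("");
        }
    }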

Do you want to e-vote today? (Part one of a study on e-voting systems)

I've just started a couple of sessions in internet security on the topic of e-voting - shamelessly riding on the wave of e-voting pros and cons in the US which culminated in Diebold's lost source code being investigated by A. Rubin of xxu university and his colleagues and students. evoting.org and Wired magazine collect most of the arguments and papers.

The whole event is a lucky strike for a security person: one can learn a lot about IT security, but even more about how states, politicians and companies treat the so-called highest value of western democracies - the voter's expressed will - differently from day-to-day IT stuff: not at all! The same "cost-consciousness" that prevents security or privacy in your regular IT appliance prevents it in e-voting as well: it's about cost and nothing else. (We'll talk about other goals below.) And when caught, the same tactics are used: lies, misinformation and damage control that does not even shy away from using the Digital Millennium Copyright Act as an instrument to censor students.

But let's start with a discussion of the events in proper order:

  1. The US company Diebold sells e-voting equipment to states in the US, e.g. Maryland

  2. The source code (or major parts of it) for those embedded control devices is stored on an FTP server and then downloaded by unknown persons.

  3. A. Rubin and colleagues conduct an investigation of the source code and detect major security weaknesses. A report is published which finds a large audience, partly because of the events at the famous Florida vote which made G. W. Bush president.

  4. In a response Diebold rejects most findings, mostly on the grounds of additional "checks and balances" which, in their eyes, mitigate the vulnerabilities.

  5. Rubin et al. write an answer to the response and maintain that their original report is correct.

  6. The state of Maryland calls for a security assessment, performed by SAIC, a US defense contractor.

  7. The SAIC report confirms several of the vulnerabilities. The SAIC report itself is published only partially with many critical sections suppressed.

There are some hidden forces as well: There is a political movement in the US which would like to see the e-voting systems use so called "voter verifiable audit trails" - basically printouts which show a voter how she voted and which could be used to conduct an audit of the voting process as well. Most e-voting companies and state organizations are not really keen to implement this feature and claim problems with printers as the reason. Pro-audit trail people think that without those receipts votes could be manipulated arbitrarily.

The (in)famous Florida vote has split the political parties into supporters of voter verifiable audit trails (only Democrats) and opponents (Republicans) who feel the whole discussion is only revenge for Florida. Democrats, on the other side, do not feel comfortable with the idea that companies which openly support the Republicans also run the elections without any kind of paper-based control mechanism.

The stage is set for the drama to unroll. We will only highlight some scenes:

"The vulnerabilities do not matter because they are covered by regulations and procedures."

This is a proven tactic: mistakes are no mistakes because they did not cause harm. Diebold uses this tactic repeatedly in its response to Rubin et al.

"The goal is to standardize voting and to improve accessibility for voters."

A sentence from the SAIC report which starts the section on requirements. The first part simply states that it's about saving costs (what else could standardizing mean in this context?) and the second part tries to give something to the main victim (and the one who has to pay the bill) of the whole idea: the voter should get better accessibility. Have you had any problems lately with voting? Or in other words: is saving some bucks (if at all) really worth risking a political nightmare: lost trust in the voting outcome?

SAIC - a defense contractor

Why did it have to be exactly a defense contractor conducting the security assessment for a state which just sank 60 million dollars into this system? Where does the company get its money from? This is not an argument against the SAIC report. But using a defense contractor and then not publishing the complete report is just not a very convincing step on the State of Maryland's side.

Missing: voter verifiable audit trail and a real

Critics pointed out that the SAIC report carefully tries not to step out of the investigative scope set by the State of Maryland. Not a word on whether a voter verifiable audit trail would improve security a lot. Just looking and commenting on the Diebold system within its limits.

Diebold giving up on censorship

The e-voting terminal vendor Diebold finally came to its senses and realized that it was handling the whole situation poorly by using the DMCA to suppress discussions, e.g. about the fact that the Microsoft Access database used had no password protection installed. According to emails from employees at Diebold this was supposed to "make updates easier". Hopefully Diebold also realized that it is not selling apples or Kleenex but things that could severely affect the lives of everybody, and that a public discussion of such a system should NEVER be prevented through legal tricks.

The Aussies seem to be able to do it better, both technically and with respect to the political evaluation of internet voting. According to a Wired article the Australian government has also given a contract to a software company to develop an e-voting system. But the contract forces the company to make the source code public. A clever move which not only saves the state a lot of money but also allows independent specialists to make an assessment.

Last but not least, an analysis of e-voting systems: the CRS Report on Electronic Voting. I found it in Bruce Schneier's latest Cryptogram, together with a lot of other good links.

How to design APIs

Recently students asked me for API design hints and tips. The case in question was so-called Bluetooth wands, which are effectively 3-D locator devices, just wireless. These wands need to be integrated and controlled using Java. They send events containing position data and the status of two buttons. The students thought they should use an event-like API (like mouse events) and they also wanted to reduce the number of events and the granularity of the position data. But it was unclear what an application would really need.

This is a very typical design problem and everybody who does system level work has been in this position already. The problem can be solved in a number of steps (actually, a more systematic presentation of the problem domain and its dimensions would be required but ...)

  1. Try to find existing APIs which do similar things (in this case we were looking at the mouse API and the Swing event model). In the case of the Bluetooth wands: look at other 3-D locator APIs and how they are used from Java. Can you treat your device just like any other locator or do you need special functions as well? Always try to at least simulate existing interfaces so that applications can use your device right away.

  2. Find some very good APIs and try to understand their design trade-offs and decisions. In this case I recommend the lessons learned from designing the SGI video APIs by Chris Pirazzi. Spend some time at his site because you will learn a lot about design in general and video specifically. Chris discusses the trade-off between information hiding and necessary control exercised by applications. And he makes it clear that certain applications have a need for the raw data/events and that no intermediate software layer will be able to do the right things for those applications. The lesson learned is e.g. that a raw, low-level interface may be a bit more cumbersome but at least it allows applications to do what they have to do. Don't try to do too much.

    I also recommend Viswanathan and Liang, Java Virtual Machine Profiler Interface (IBM Systems Journal Vol 39 No 1 2000). This API shows how an application gets exactly the information it needs through an event-like API. No drowning in unnecessary information, no undue load on the VM to produce this information.

  3. Read the paper from Butler Lampson, Hints for Computer System Design. He discusses a lot of design pitfalls like overgeneralization and trying to do too much.

  4. Try to imagine a lot of different applications but be aware that you will never be able to catch all possible uses of your interface in advance. That means prepare for this with a configuration interface that allows applications to say what they want. But don't overdo it: Chris Pirazzi makes it very clear that most applications will not be able to deal with new things provided by a system level API. They are written against one device and will simply not work with anything else. So wasting a lot of effort on dynamic properties is probably not very good.

  5. Start to layer your API into one that allows raw access and one that provides convenience functions. Make sure the higher one uses the lower one, or you will get inconsistencies if one application uses the lower and another the upper layer (a small sketch of such a layering follows after this list). Decide on whether you will allow applications to use both layers concurrently.

  6. Decide on concurrent use NOW! If you don't prepare for concurrent use from the beginning, your applications will not be able to use threads later. This covers the use of global variables, handles etc. Make your API reentrant from scratch.

  7. Decide on access control NOW! How many applications or threads will be allowed? Does it make sense to use e.g. a locator device concurrently? How about a sound card?

  8. Realize that you will not get it right in the first go. Plan for more releases by using API versions and - if needed - design a protocol upgrade and extension scheme which allows future releases to offer new functionality without breaking existing uses and clients.

  9. Be consistent! For a bad example from EJB design see http://weblogs.java.net/pub/wlg/1026 which shows how confusing it is if the same method in different bean types has totally different semantics.
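
To make the layering idea from step 5 a bit more concrete, here is a small sketch (all names are invented - this is not the students' actual wand API): a raw layer that reports every event unfiltered, and a convenience layer that is itself only a client of the raw layer.

    // Sketch: raw layer plus convenience layer built on top of it.
    public class WandApiSketch {

        // Raw layer: every event, no filtering, no interpretation.
        public record WandEvent(long timestampMillis, float x, float y, float z,
                                boolean button1, boolean button2) {}

        public interface RawWandListener {
            void onEvent(WandEvent event);
        }

        // Convenience layer: uses only the raw layer, so both stay consistent.
        public interface WandClickListener {
            void onClick(int button, WandEvent at);
        }

        public static RawWandListener clickAdapter(WandClickListener clicks) {
            return new RawWandListener() {
                private boolean last1, last2;
                @Override public void onEvent(WandEvent e) {
                    if (e.button1() && !last1) clicks.onClick(1, e);   // rising edge = click
                    if (e.button2() && !last2) clicks.onClick(2, e);
                    last1 = e.button1();
                    last2 = e.button2();
                }
            };
        }
    }

Applications that need the raw position stream register a RawWandListener directly; applications that only care about clicks use the adapter - and nobody is forced through a layer that throws away data they need.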

Software Engineering examples shown at sd&m on Monday, 21 October 2003 in Stuttgart, more

For students, postgrads etc.: sd&m invites you to an event on software engineering. Between 10.00 and 14.00 several software projects (global ordering for DaimlerChrysler, Car-PC for Robert Bosch AG, eBusiness for Heidelberger Druckmaschinen) are presented, and there is ample room for discussion at the buffet. Please contact Jutta Lanz or Kerstin Booch (jutta.lanzl@sdm.de or 0711 78324-101) for more information.

Server consolidation on mainframes

From a business point of view server consolidation is certainly a hot topic. But it is tied intimately to, e.g., J2EE technology, virtual machine concepts, Linux on zOS and last but not least load balancing, distributed caching etc. Questions to ask are: how much of this works today? Which virtual machine concepts exist and which ones make the most out of Linux applications? When will Linux apps benefit fully from the underlying mainframe services? How fault-tolerant are e.g. web applications running on Linux on the mainframe?

Note

This sounds like a high-end thesis as well: software and hardware issues in server consolidation. It needs an understanding of the J2EE architecture and how it separates contexts and concerns, and also a deep understanding of virtual machines and zOS. Let's ask IBM if they are interested here.

High-availability with J2EE

The FIZ Karlsruhe is looking for somebody interested in this topic. If you need further information on J2EE, take a look at the book on Websphere by Eric Herness, Rob High and others. I've worked with both authors and they provide a thorough understanding of the J2EE architecture. There is also a new redpiece on websphere clustering available from IBM Redbooks and last but not least look at the book "Load Balancing Servers, Fire Walls and Caches" by Chandra Kopparapu. The best you can find on how to load-balance while keeping session information etc.

Google Tricks and Information on Web Searching and archives

Tara Calishain, one of the two book authors, is well known for her site on search tools, archives etc. Get her newsletter from ResearchBuzz. The book itself improved the way I use Google considerably, especially by teaching me the large number of Google meta tags (insite, inurl, link etc.). Advanced searchers will probably also profit from going through the Perl scripts to automate certain procedures (be careful, Google does not like automated searches). The Google API is explained and - surprise - RESTful approaches are discussed as well. I got the book (as well as all the others I present on my site) for our library. Enjoy!

Link Topologies and power outages: Barabasi's book "Linked"

Link topology has always been a vital component of distributed systems. With peer-to-peer systems becoming more and more popular I've made them part of my distributed systems lecture. The theoretical foundations come from Albert-Laszlo Barabasi, the famous network researcher. Barabasi discovered the far-from-random characteristics of many networks, e.g. the WWW. He found that many networks exhibit so-called super-hubs: nodes that continue to acquire more and more links and that serve as central connecting stations. These types of networks are called "scale-free" because their degree distribution follows a power law rather than a Gaussian. They also exhibit interesting features like a general everyday robustness paired with a fatal tendency to collapse once a certain number of those super-hubs are destroyed. Barabasi detected the same type of connectivity in typical power networks all over the world and derives their vulnerability to large-scale outages from it.
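
The mechanism behind those super-hubs is preferential attachment ("the rich get richer"), and it is easy to simulate. In this little sketch every new node links to an existing node with probability proportional to that node's degree, and a few nodes end up with far more links than the average:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;

    // Sketch of preferential attachment: new nodes prefer well-connected nodes.
    public class PreferentialAttachmentSketch {
        public static void main(String[] args) {
            Random rnd = new Random();
            List<Integer> endpoints = new ArrayList<>();   // each link lists both ends, so
            endpoints.add(0); endpoints.add(1);            // high-degree nodes are drawn more often
            int[] degree = new int[10_000];
            degree[0] = degree[1] = 1;

            for (int node = 2; node < degree.length; node++) {
                int target = endpoints.get(rnd.nextInt(endpoints.size()));  // proportional to degree
                degree[node]++; degree[target]++;
                endpoints.add(node); endpoints.add(target);
            }

            int max = 0;
            for (int d : degree) max = Math.max(max, d);
            System.out.println("largest hub degree: " + max + " (average is about 2)");
        }
    }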

Looks like we got lots of confirmations during the last weeks with outages in England, Italy etc.

But it gets even more worrying: a virus has a certain power to make new victims sick. For a long time only the rate of new victims divided by the number of people who came into contact was considered important. But this formula forgets about the overall chances for a virus to make contact - which is determined not least by the type of network the virus lives in. In scale-free networks with super-hubs, those hubs make sure that even viruses with a relatively low power to create new victims will spread like crazy: they have all the chances in the world to make new contacts!

Perhaps that is one of the reasons why SARS made the authorities so nervous. SARS by itself is not such a dangerous or powerful virus - but our airports could spread it all over the world within hours.

But Barabasi also shows how much information can be gained from link analysis. People discovered that the web is split into many different areas with very different connectivity patterns: the core zone (all sites well connected), the in zone (private sites pointing to the core zone without backlinks), the out zone (company sites receiving links from the core and in zones but not linking back), and isolated islands not connected to the rest of the web.

What if you had access to a large company's intranet with all the link information? Just by tracing the connectivity patterns and comparing them with the organizational structure of the company one could draw conclusions about the degree of interaction and the synergies achieved etc. How could knowledge about link structures improve company-internal searches for information? Those searches are often unsuccessful because e.g. the page ranking done by the Google algorithms is not performed.

Usability and Security

Try to order tickets for a concert from SWR3. SWR3 is a large radio station which caters mostly to young listeners. When you go through all the web pages you will notice the following security and usability problems:

Lots of personal information is requested without a privacy statement or protection through a trusted connection.
Only one half of a page is shown. If you scroll down (and nothing really forces you to do so) you will discover something at the very end of the page: a button to request an SSL connection for your private data.
When you try to use the https connection your browser will complain about an unknown certificate authority used by the web server. When you check it, it turns out to be a self-signed certificate and nothing you could really verify.
And when you think the worst is over, it really isn't: after accepting the private certificate a new dialog box is opened and your browser tells you that the server mentioned in the certificate is not the one you are going to connect to.

A nice example of how usability and security deficiencies mix.

I am looking forward to the new thesis and I will keep you informed about what is happening in this area.

Note

I am planning a workshop on usability, especially in connection with security and I would like to get your input.

Usability in embedded control

Web usability, for example, came into focus a long time ago, thanks to Jakob Nielsen and others. Friends of mine did a very interesting study on usability in e-government. But usability in embedded control applications still seems to be a scientific orphan - even though we have to deal with more and more devices every day.

Example 2. The lost pictures

When my daughter tried to erase some pictures from an almost full 128MB compact flash card from our new Pentax GS330 digital camera she erased all pictures. No "undo" possible of course. After trying to calm down my wife and coming to a shaky ceasefire within the family I tried to find out how it happened and discovered a usability problem with the camera. Let me explain how one deletes pictures with the camera. First you press a DELETE button with a trashcan icon on it. This gives you a menu with two entries: CANCEL (highlighted) and ERASE. Using a toggle switch you navigate up from CANCEL to ERASE and push a button. But which button? There is the DELETE button and there is the general OK button in the middle of the navigation switch. The consequences are dire: If you press the OK button now you will erase the selected picture. So far so good. But if you press the DELETE button (which you used to start the delete process) again, ALL PICTURES WILL BE DELETED without further questions and of course no undo ...

I almost could not believe this but it turned out to be true. Luckily I stopped using the memory card immediately and at home I was able to restore the pictures with the help of Greenspun's photo site which pointed me to Kurt Stege's JPGDUMP utility.

But I noticed that I was by far not the only person having this problem. People lost pictures with all kinds of digital cameras thanks to stupid user interfaces.


Note

I am planning a workshop on usability, especially in security and embedded control areas but also covering e-government and web design. If you have some input I'd be more than happy to invite you to this workshop.

A systematic look at the art of deception

Mitnick lists a couple of mechanisms by which social engineers take advantage of human nature:

the tendency to comply with requests from authorities
the tendency to help people whom one likes
the tendency to give something if you got something (reciprocation)
the tendency to be consistent in one's actions (even if the first actions were based on shaky ground)
the tendency to accept people vouching for others (social validation or the transitivity of trust)
the tendency to fall for greed (scarcity)

Kevin Mitnick, The Art of Deception, p. 246

What makes these mechanisms work? Perhaps we need to take a look at the "art" in deception. Why does Mitnick use the term here? My guess: because the "art" in the way deception works in his stories is not based on simple lying. The deceivers surely tell lies, but those lies DO NOT MAKE THE TRICK. The lies are only there to make the victims jump to conclusions, and it's the conclusions that lead to fatal mistakes. The deceiver never mentions the conclusions but leads the victims to draw those conclusions by themselves.

What is the vehicle used in those deceptions? In most cases it is language, but it could be the style of the clothes you are wearing, your car etc. as well. Language and behavior are based on structures in our society: syntax and social structures. Which means a sentence is not an arbitrary statement. It has logical preconditions that make it acceptable. And behavior has social preconditions attached to it. The problem is that these logical or social preconditions need not be true.

Example 3. Wrong accusations

If you hear the sentence "why are you late again?" you will rightfully conclude that the person being spoken to has already been late at least once. You will jump to this conclusion NO MATTER WHETHER THE SENTENCE ITSELF IS TRUE. The structural logic of sentences in natural language makes you do it. And it requires an active act of thinking to realize that the accusation might not be true.


Example 4. The big car

Expensive cars are one of the oldest means of deception. Expensive cars structurally imply a good income etc. in most societies. Again, it takes an active act to realize that cars might be borrowed, leased etc. But many people first conclude that the driver is a wealthy person.


What is the scientific background of this behavior? Perhaps structural analysis can help us explain how this works. In the early eighties Michael Titzmann wrote his book "Strukturale Textanalyse" (the structural analysis of texts) and used structural analysis to analyse all kinds of literature: from poetry to political speeches. Boiled down to the basics, his methodology simply uncovers the things a text does NOT say. The theory is that a text forms a structure of statements which needs to be consistent to be acceptable. The problem is that a text need not express all those statements. Many statements (we can call them preconditions or conclusions) are never mentioned explicitly in a text. A text therefore contains "structural holes" which can be filled by somebody analysing the text. And once the holes are filled, the whole text becomes visible - with conclusions and preconditions now out in the open.

I don't remember whether Titzmann also applied his methodology to social areas (like the car example) but there is no reason why it should not work there as well.

I've used this technique in my thesis work on the content analysis of movies and the results were quite promising. And with Mitnick's stories I can use the same methodology to analyse why people get tricked. We seem to have this "interpretation engine" in our head which permanently forces us to conclusions. Unfortunately it takes an active act of thinking (you can call it awareness) to realize this effect. And worse: to notice the permanent interpretation in our brain one needs to look over one's own shoulder WHILE working, talking and so on. So Mitnick is right when he says that only awareness will prevent you from being tricked - no amount of technology will help you here. And this awareness is hard to learn. As hard as understanding the hidden arguments in today's political texts.

Back to Mitnick. His book contains mostly tricks played over the telephone, but if you transfer them to modern technology (email etc.) the effects get even worse. This is no coincidence. While a personal conversation leaves the victim at least with a chance to verify a few more parameters (language, clothing, general appearance etc.), mail leaves the receiver only with the ominous "from" and "to" tags. So jumping to conclusions is even easier, and worms like "Klez" have shown how socially devastating the use of address book entries for a mail sender address can be: people who never sent a certain mail got cursed by victims of Klez.

So Mitnick's book is confirmation for me to continue treating internet security as an awareness problem in the first place.

Linux Certification - should we create a non-commercial one?

The title says it all: do we need a Linux certification process for academia? I am currently using the IBM-sponsored Linux system administration tutorials in exercises for my lecture on operating systems. Much to my surprise the students really appreciated the self-study material and even suggested taking the certification tests. This made me look for certification procedures at universities and - much to my surprise - I could not find even one. The Linux Professional Institute (LPI) offers certifications for several Linux areas in connection with those tutorials. So far so good, but LPI uses two commercial companies (VUE and something I forgot) to run the tests. Actually, these companies have written general-purpose software to perform class testing, and this software is needed by all companies which want to conduct those certification tests.

I have a number of problems with this approach:

  1. The costs for being tested are steep for students (150 Euro per test)

  2. A lot of Windows-based software needs to be installed - for a Linux certificate! I don't know about the costs of this software.

Our university (and that should be the case for all universities) has all the equipment and know-how needed to perform certification testing as well. The only thing missing is a Linux certificate plan and process, which of course should be open source. Tests could be taken for free or for a nominal fee at participating universities. The classes and tests could be organized in the following categories:

LK1 to LKn for kernel certificates
LD1 to LDn for development classes
LS1 to LSn for system admin classes
LX1 to LXn for special classes

Most Linux literature is free anyway and some excellent tutorials and classes already exist. The classes should be independent of Linux distributions.

WI-FI, P2P and Democracy

A couple of different disruptive technologies seem to converge at this moment: WI-FI as the possible liberator from the telcos' infrastructure monopolies, and peer-to-peer technology as the organizational means for the masses (see Richard Koman's article about Howard Rheingold, "The next revolution: smart mobs"). But something is missing in all those liberating revolutions based on new technology: direct democracy as a disruptive and liberating form of political management. The last one is rarely heard about and needs some further explanation.

The first victim of the current gulf war wasn't the truth - there never was any before. The real victim was our current understanding of western democracy as a way for the people to control political power. This has been exposed as an ideology: in many countries 90% of the population were against this war, but the respective governments still decided to support it. This, in turn, puts e.g. the Swiss system of direct democracy into a new light: such behavior by a government would not be possible in Switzerland.

But the Swiss system is believed by some parties to be inappropriate for larger countries. Parties are seen as the filters for public opinion, the means by which public opinion is formed. So are the media. Both political parties and the media have shown in the last couple of months that they are far from fulfilling their roles.

There is an interesting equivalence between the delegation of rights in western democracies and the delegation of rights in operating systems. After handing in their vote, voters have no control whatsoever over the execution of their rights - just like starting a program on a standard operating system. All the user can do is hope that the program isn't a virus or a trojan horse, because by starting it the user has handed over her complete set of rights to the program.

The obvious gap between public opinion and political power has not been a political topic so far. What could change this?

This is where WI-FI and P2P come in. Both technologies have an air of "freedom" and participation attached to them. With WI-FI a worldwide net of privately held communication networks seems possible. You cannot "embed" thousands of web cams reporting wirelessly through peer-to-peer channels. Completely "out of control" in Kevin Kelly's sense - and out of the control of the big state organizations. Just imagine the security woes for those. Large monopolies are easily forced to invade privacy on a large scale; this is much harder with many small, distributed networks.

What technical ingredients are still missing to create the global WI-FI net? It requires strong end-to-end security: people will not likely share their access points if they have to fear that other users will invade their infrastructure. It will also require a strong form of quality-of-service control: sharing will not happen if the owner's service level (which she pays for) is degraded by freeriders. And we will need fuel cell technology - WI-FI eats energy like nothing else. But the technical means for a worldwide net are not enough. It takes organization as well. Organization starts with names and addresses and a way to meet peers and form groups. This is provided by peer-to-peer technology, e.g. Sun's JXTA framework.

There are some vague candidates as well: webservices and GRID technology. Webservices were also dubbed "disruptive", but honestly - the only webservice-like thing that I know is the public Google API. And you have to pay for it. GRIDs are dynamically allocated computing resources which solve problems without location dependence. Sounds similar to the touch of "freedom" in P2P and WI-FI. But GRIDs are also somehow connected to large clusters, mainframe technology, scientific computing etc. - nothing really hot in political terms. But if we imagine GRIDs more in the sense of seti@home, then we end up with a global pool of computing resources. And mixing WI-FI and P2P into this we have the dream of open source come true: free and permanent connectivity, easy grass-roots organization and a large, dynamic pool of computing resources always available - for free? How will this change our political systems? Currently the old powers (military, oil) are in charge, but the seeds for a new round of disruptive revolutions have been planted.

We sure do live in interesting times. And I would just love to go to O'Reilly's emerging technology conference in my old town Santa Clara.

Computer Organization for application programmers (see sidebar)

I am preparing a lecture on operating systems and one of the frequently asked questions is: how deep into the hardware do you go? John Carpinelli's book on Computer Systems Organization and Architecture deals mostly with hardware issues - how to design a CPU, a complete computer etc. - but it does so through very simple examples, so that even application programmers will be able to understand it. It is NOT a book on system programming, but you will benefit much more from such a book if you've read Carpinelli's book first. I liked the fact that he starts with finite state machines - a nice contrast to how we software people usually design our programs. The example computer architecture explains memory and I/O subsystems and how they work. Caches, memory management etc. are also implemented. Again - not a system programming book, but one that shows you how to develop hardware.

Another thing that I really liked is his simulators. Java applets let you simulate two different CPUs, a PAL, a complete computer etc. It is fun to assemble a small program, put it in memory and watch the CPU run it.

I will use this book to explain how e.g. a byte is written to an I/O device, how interrupts work and so on. It will provide the base for more system and application programming later on. But I want the students to understand concepts like clock-driven systems, extension buses, caches etc., so that in later lectures they will recognize them as general principles and not as something tied only to hardware design.

And of course we will run the simulators, together with other simulators from simjava, e.g. a cache simulator supporting associative lookup, n-way set-associative caches or directly mapped caches.
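
To make one of those concepts concrete, here is a toy direct-mapped cache lookup in Java; the sizes and the address split are my own choices and not taken from Carpinelli's book or the simjava applets.

    // Toy direct-mapped cache: 16 lines of 4 bytes each.
    // Address layout: | tag | 4 index bits | 2 offset bits |
    public class DirectMappedCache {
        static final int LINES = 16;

        int[] tags = new int[LINES];
        boolean[] valid = new boolean[LINES];

        boolean access(int address) {
            int index = (address >> 2) & (LINES - 1);   // strip offset bits, keep index bits
            int tag = address >> 6;                     // everything above index is the tag
            if (valid[index] && tags[index] == tag) {
                return true;                            // hit
            }
            valid[index] = true;                        // miss: "fetch" the line, remember its tag
            tags[index] = tag;
            return false;
        }

        public static void main(String[] args) {
            DirectMappedCache cache = new DirectMappedCache();
            int[] trace = {0x00, 0x04, 0x00, 0x40, 0x00};   // 0x00 and 0x40 collide on index 0
            for (int a : trace) {
                System.out.printf("0x%02x -> %s%n", a, cache.access(a) ? "hit" : "miss");
            }
        }
    }

Running the trace shows why two addresses that map to the same line keep evicting each other - exactly the kind of effect the simulators make visible.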

WEBDAV: how to design an internet protocol

Design experience is hard to find. I found an excellent source at the webdav.org portal site: Yaron's WebDAV Book of Why, under Miscellania, and it reads wonderfully. What are the consequences of offering a "depth" parameter in collection operations? Who would have thought right away about possible denial-of-service attacks using depth? It is dangerous to design an internet protocol where a request is almost free for the client but can cause a lot of work on the server side. I had the same experience in a portal project where the personalized portal homepage caused extreme resource consumption on the server side.
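
To illustrate the point about cheap requests causing expensive work, here is a minimal sketch of a guard against unbounded depth, written as a hypothetical Java servlet filter - the class name and the policy are mine, not taken from any WebDAV implementation.

    import java.io.IOException;
    import javax.servlet.*;
    import javax.servlet.http.*;

    // Rejects PROPFIND requests with "Depth: infinity" before they reach the repository;
    // many WebDAV servers refuse infinite-depth PROPFIND for exactly this reason.
    public class DepthLimitFilter implements Filter {
        public void init(FilterConfig config) {}
        public void destroy() {}

        public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
                throws IOException, ServletException {
            HttpServletRequest request = (HttpServletRequest) req;
            HttpServletResponse response = (HttpServletResponse) res;
            String depth = request.getHeader("Depth");
            if ("PROPFIND".equalsIgnoreCase(request.getMethod())
                    && "infinity".equalsIgnoreCase(depth)) {
                response.sendError(HttpServletResponse.SC_FORBIDDEN,
                    "Depth: infinity PROPFIND is not allowed on this server");
                return;
            }
            chain.doFilter(req, res);
        }
    }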

A protocol connects clients and servers. How do you design the protocol so that server errors are visible AS SERVER ERRORS on the client side? Otherwise the companies writing the clients will get all the support calls. I would not have thought about this pragmatic problem without reading Yaron's bedtime stories.

I believe we should design class interfaces in OO much more like protocols. One way to achieve this is to change the way we teach OO: do pair programming with one student writing the client and the other the server, and let them find out about the contractual status of an interface.
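
To make the "interface as contract" idea a bit more tangible, here is a minimal Java sketch (all names are mine, purely for illustration): caller mistakes and server failures surface as different exception types, which also addresses the point above about server errors being visible as server errors.

    // The interface is the protocol contract between the two students.
    public interface DocumentStore {
        byte[] fetch(String path) throws BadRequestException, ServerFaultException;
    }

    class BadRequestException extends Exception {        // the caller violated the contract
        BadRequestException(String msg) { super(msg); }
    }

    class ServerFaultException extends Exception {       // the implementation broke its promise
        ServerFaultException(String msg, Throwable cause) { super(msg, cause); }
    }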

Another important topic: be careful with what goes into the body of an http request. The http header is used not only for fast dispatching on the receiver's side; many intermediaries (firewalls, proxies etc.) will also act on header information. Putting functions into the body of an http POST request runs the danger of re-implementing a lot of http. This is called "tunneling" http and is one of the most criticized design aspects of SOAP. The RESTafarians believe that SOAP just duplicates a lot of http functionality, fragmenting the web into incompatible services.

WEBDAV decided to use a connectionless protocol. Versioning will allow client and server workspaces and seems to follow good old CVS practices here. Requiring permanent connections between clients and repositories is a bad idea.

Last but not least Yaron argues for KISS principles: avoid complex operations if you don't understand the side effects. Do not mix aggregate requests with simple requests. If a result can be achieved through a sequence of simple requests - don't invent a convenience function for it. Some extra requests don't hurt, especially when you think about the DoS aspect from above. Go for "if you don't know a tag, ignore it and keep on going". This principle allows upward-compatible protocol extensions.
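
As a small illustration of the "ignore unknown tags" rule, here is a sketch using the standard Java StAX API; the element names are invented and not taken from the WebDAV schema.

    import javax.xml.stream.*;
    import java.io.StringReader;

    // Reads only the elements it knows and silently skips everything else,
    // so a newer peer can add tags without breaking an older one.
    public class LenientParser {
        public static void main(String[] args) throws Exception {
            String xml = "<props><displayname>report.txt</displayname>"
                       + "<some-future-extension>ignored</some-future-extension></props>";
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                if (r.next() == XMLStreamConstants.START_ELEMENT) {
                    if ("displayname".equals(r.getLocalName())) {
                        System.out.println("name = " + r.getElementText());
                    }
                    // anything else: no error, just keep on going
                }
            }
        }
    }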

There is much more to say about WEBDAV, e.g. why it needs to go into the OS kernel but I will write a complete article on WEBDAV anyway...

Computers for morons, operating systems for livestock

Every once in a while I like to pick something from my stack of "articles to read or re-read". David Gelernter's essay "The second coming - a manifesto" fits nicely with my current work on preparing an operating systems class - hopefully not one for livestock (;-). Gelernter, the father of Linda and tuple spaces, knows how to be provocative. He says that current operating systems are very old and do not properly support humans in their work. He mentions the file metaphor and says it was made by programmers, for programmers. Computers are NOT filing cabinets, because they can be active. His example is very interesting: the file metaphor forces us to classify information UP FRONT, before storing it. Gelernter says it should be the computer's job to properly store and classify things. Human memory works associatively, by content-based lookup: the mind classifies at retrieval time, not at storage time.

Another interesting statement by Gelernter is about "lifestreams". Everything we do electronically is not ordered hierarchically into several more or less fitting categories. Instead, it is put at the top of our electronic lifestream. We can navigate along the time axis and focus on specific documents. Focusing means asking the computer to do a categorization or set selection in real time.

And last but not least these lifestreams are close to our current weblogs - with the little difference that my computer does not automatically copy a new blog entry into the several other categories I decided to have. E.g. if I talk about a book, this entry should equally show up on my bookmark page and not only in the blog. Gelernter's computer would detect this automatically. Here one can see Gelernter's history of tuple spaces: the computer would "trap" events (like me writing this blog entry) and take action automatically. This requires a shared information space, e.g. a tuple space.
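
A toy sketch in Java of the lifestream idea (my own classes, nothing from Gelernter's actual systems): entries are only ever appended in time order, and "focusing" is a query that categorizes at retrieval time instead of filing things up front.

    import java.util.ArrayList;
    import java.util.Date;
    import java.util.List;
    import java.util.function.Predicate;

    class Entry {
        final Date created = new Date();
        final String text;
        Entry(String text) { this.text = text; }
        public String toString() { return created + ": " + text; }
    }

    class Lifestream {
        private final List<Entry> stream = new ArrayList<>();

        void append(String text) {                       // storing needs no category decision
            stream.add(new Entry(text));
        }

        List<Entry> focus(Predicate<Entry> question) {   // categorization happens on demand
            List<Entry> result = new ArrayList<>();
            for (Entry e : stream) if (question.test(e)) result.add(e);
            return result;
        }

        public static void main(String[] args) {
            Lifestream blog = new Lifestream();
            blog.append("Read Gelernter's manifesto today");
            blog.append("A new book on computer organization arrived");
            // the "bookmark page" is just another focus, not a second copy of the entry
            System.out.println(blog.focus(e -> e.text.toLowerCase().contains("book")));
        }
    }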

Good scripts and an operating system class

Operating systems are not exactly en vogue nowadays (besides Linux of course). There is a certain canonical literature whose most prominent example is probably the book Modern Operating Systems by Andrew Tanenbaum. But what if your needs are a little bit different? E.g. you need a class for students in computer science and media and you know that you won't turn them into kernel hackers (at least not all of them). See what we plan in our What we do in Operating Systems article. One problem is e.g. that we need to introduce C, but from a Java background. Surprise - there is a good script and tutorial on the web: C for Java programmers by Jason Maassen. Get it with other excellent tutorials from 1001tutorials.com or - equally good - from Intelinfo. And for Linux an excellent resource exists with the Linux Documentation Project.

Ever wondered why English computer books are so much more popular than German ones? No need to wonder anymore - go to Howstuffworks and see how somebody explains e.g. how the programming language C, the PC BIOS etc. work. These authors seem to have only one goal: get the audience to understand the topic - not to shine in front of an academic audience. I tell my students, when they have to give a talk about their thesis etc., that it is their colleagues they are doing this for and that they should always - in writing or in oral presentations - develop a user/receiver model: what would the audience like to read or hear?

Sometimes we can't do it (or can we?): Computability, Turing, Eliza and Peter Wegner's Interaction Machines

I started writing on this "trend" but it got too big for the blog. Find the small article here

How web-browsers could help protect our privacy

Is it just me or are there more and more people concerned about the way companies handle our privacy? The browser is THE main interface to the web and the internet for most people. So what could be done to improve the situation?

Here is the list of small but effective changes that could be implemented:

  • Convert html mail to text. Why is there no button to select this option? The reason is simply that companies want to track their (spam) mails and our behavior through web bugs.

  • Suppress image references that leave the current domain. This should at least be an option; it would prevent doubleclick, extremetracking and others from creating profiles of our web activities.

  • Allow only temporary cookies. Simulate persistent cookies but delete them at the end of a browser session. Right now one has to delete all persistent cookies and then make the cookie file read-only.

  • Suppress those pesky pop-up windows. This is now offered at least by Mozilla. But why stop there: do not allow JavaScript to open new windows on different domains, do not allow scripts in image references etc. At least turn those critical functions into capabilities (see below) which let ME decide whether I want to grant them to the browser.

I am sure there are more possibilities (allow JavaScript but not ActiveX scripting - which normally would prevent you from viewing PDF files!), but the fact that even those simple ones are unavailable keeps me thinking...
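
As a sketch of the second item in the list above - suppressing image references that leave the current domain - here is a minimal Java filter. It is not tied to any real browser extension API; the class and method names are mine, and a regex over HTML is only good enough for a toy.

    import java.net.URI;
    import java.util.regex.*;

    // Drops <img> references whose host differs from the page's host (possible web bugs);
    // everything else is copied through unchanged.
    public class ImageRefFilter {
        private static final Pattern IMG_SRC =
            Pattern.compile("<img[^>]*\\bsrc=[\"']([^\"']+)[\"'][^>]*>", Pattern.CASE_INSENSITIVE);

        public static String stripForeignImages(String html, String pageHost) {
            Matcher m = IMG_SRC.matcher(html);
            StringBuffer out = new StringBuffer();
            while (m.find()) {
                String host = URI.create(m.group(1)).getHost();
                boolean foreign = host != null && !host.equalsIgnoreCase(pageHost);
                m.appendReplacement(out, foreign ? "" : Matcher.quoteReplacement(m.group(0)));
            }
            m.appendTail(out);
            return out.toString();
        }

        public static void main(String[] args) {
            String page = "<p>Hi<img src=\"http://ad.doubleclick.net/pixel.gif\">"
                        + "<img src=\"http://www.example.org/logo.png\"></p>";
            System.out.println(stripForeignImages(page, "www.example.org"));
        }
    }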

The German computer magazine c't (1/2003) had an interesting article by Matthias Withopf and Axel Kossel about how to put Internet Explorer on a leash. They created a configurable startup shell around IE which prevents unwanted COM objects from being loaded - just like a "program personal firewall" (see capabilities below). Nice idea, but what makes me uncomfortable is that this little shell (IEController) needs to run as root - sorry, "Administrator" on those boxes - to be able to patch IE. But the authors are perfectly right in saying that this simple measure could have been implemented ages ago by Microsoft. Again, a clear sign that there really is NO INTEREST AT ALL in your security or privacy.

Back to the mainframes

I started writing on this "trend" but it got too big for the blog. Find the small article here

Collaborative Spam Filtering

Just a crazy idea, but could it work? I have heard a lot lately about statistical methods (Bayesian networks etc.) to fight spam, but nothing about collaborative methods. Those have proven extremely successful as a suggestion tool, e.g. at amazon.com, at least compared with content-analysis techniques. How about creating fingerprints of deleted spam messages and sending them to spam-registering servers? Mail clients could create fingerprints of new messages and compare them with those already stored on such servers. Of course spammers can make subtle changes, but the fingerprinting system could become clever as well. It's all hares and tortoises anyway.
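
Here is a rough sketch of the client side in Java; the normalization rules and the in-memory "registry" are stand-ins for the real spam-registering servers, and a real system would need fuzzier fingerprints than a plain hash.

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.util.HashSet;
    import java.util.Set;

    public class SpamFingerprints {
        // stands in for the shared spam-registering server
        private final Set<String> registry = new HashSet<>();

        static String fingerprint(String body) throws Exception {
            // crude normalization so that trivial edits don't change the fingerprint
            String normalized = body.toLowerCase().replaceAll("\\s+", " ").replaceAll("[0-9]+", "#");
            byte[] hash = MessageDigest.getInstance("SHA-1")
                    .digest(normalized.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) hex.append(String.format("%02x", b));
            return hex.toString();
        }

        void reportDeletedSpam(String body) throws Exception {
            registry.add(fingerprint(body));            // user deletes spam -> fingerprint is shared
        }

        boolean looksLikeKnownSpam(String body) throws Exception {
            return registry.contains(fingerprint(body)); // new mail is checked against the registry
        }
    }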

Capabilities and "E"

I've been to a very good talk on capabilities today. Two students presented their work with the new operating system "eros" and its programming language "E". Fascinating stuff, e.g. the promises concept: weak references that can be used to make progress in a concurrent application without falling back to usually error-prone thread programming. The concept of "sturdy" references comes close to what e.g. Java Data Objects call a "hollow" reference: a reference to an object that has just participated in a transaction and got stored in a persistence layer somewhere, but the reference - basically the name - is still in memory to allow re-loading etc. A sturdy reference is one that has no live connection to its real object but can still be used and forwarded. Only those method calls that would require direct access to the real object don't work. So we've got promises and sturdy references to make progress without locking. Only when we absolutely need the evaluated result do we use a wait instruction, which forces the system to evaluate all dangling computations.

What if we give up on the idea of "live" references to remote objects and try to live with what "E" calls "promises"? Messages to promises are put into a kind of worklist for later resolution. We need to restructure our programs into parts that can work with promises and sections where we need to wait for results. We have talked a lot about transparency in distributed systems (the famous Waldo paper e.g.). We tried to find abstractions to hide - or better: to make manageable for programmers - the remoteness of distributed objects. Promises try to abstract away the concurrency part of distributed programs without losing asynchrony. Synchronous calls keep programming easy at a high performance price. Asynchronous calls currently create a forest of threads and are hard to program. Promises do away with threads at the price of a changed programming model.
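
Here is a toy version of the promise idea in plain Java (my own class and method names, not E's actual semantics - E's promises also support pipelining and much more): messages sent to an unresolved promise are queued, they run once the promise resolves, and only an explicit wait blocks for the value itself.

    import java.util.ArrayDeque;
    import java.util.Queue;
    import java.util.function.Consumer;

    class Promise<T> {
        private T value;
        private boolean resolved;
        private final Queue<Consumer<T>> pending = new ArrayDeque<>();

        synchronized void send(Consumer<T> message) {   // a "message to a promise"
            if (resolved) message.accept(value);
            else pending.add(message);                  // worklist for later resolution
        }

        synchronized void resolve(T v) {                // the remote result arrives
            value = v;
            resolved = true;
            while (!pending.isEmpty()) pending.poll().accept(value);
            notifyAll();
        }

        synchronized T await() throws InterruptedException {  // the explicit "wait instruction"
            while (!resolved) wait();
            return value;
        }
    }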

This is not really related to security, but "E" finally clarifies a lot of myths about capabilities. Capabilities are digital keys that a program must use to call functions on some object. Unlike ACL entries they are under the control of their holder and can therefore easily be forwarded. OK, this means on the other hand that forwarding is not under the control of the resource. But capabilities can be revoked as well. From a scalability and distribution point of view capabilities are an ideal solution. They really work against viruses and trojans by not automatically running them with the authority of the principal. ACL-based systems test the identity of the principal and then hand over all the principal's rights to the running program - unrestricted.

In a way the concept is close to SPKI, where not a name but e.g. a right is bound to a key (see the SPKI draft by Bill Frantz and others). Delegation is solved easily in both systems: SPKI builds a chain of certificates, and the final receiver of this chain can check that every issuer along the way had the right to create new authorizations and delegate them.

Capability-based systems need to create a frame for all activities by not letting anyone obtain a capability in an uncontrolled way. A capability is a right. This leads to design patterns like handing only facet objects (restricted views) as parameters in function calls. One anti-pattern comes from Java, where file buffers are wrapped and wrapped but can be cast back down and then allow more rights than the wrappers would grant - or where subclasses weaken superclass restrictions.
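
A small Java sketch of the facet pattern mentioned above (interfaces and names are mine): callers receive a narrow, revocable view instead of the full object, so possession of the reference is the right.

    import java.util.ArrayList;
    import java.util.List;

    interface ReadOnlyLog {                       // the facet: the only right being handed out
        List<String> entries();
    }

    class Log {
        private final List<String> entries = new ArrayList<>();

        void append(String line) { entries.add(line); }

        // hand THIS to other code, never the Log itself; revoke() kills the capability later
        static class Facet implements ReadOnlyLog {
            private Log target;
            Facet(Log target) { this.target = target; }
            public List<String> entries() {
                if (target == null) throw new IllegalStateException("capability revoked");
                return new ArrayList<>(target.entries);  // defensive copy, no write access leaks
            }
            void revoke() { target = null; }
        }
    }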

Finally: both SPKI and capabilities in E use the concept of S-expressions heavily. I need to look it up - it seems to come from Lisp.

Security Architectures for Internet Applications

Christophe will give an introduction and then guide us through a larger example. Ample room for discussion and questions will be provided. The focus is not so much on the smallest technical details (like how to configure a reverse proxy) but on the role security decisions and tools play and how they relate to each other. Starting with a fictitious internet company, a security model will be developed (including e.g. a first-cut risk analysis) and from there the technical architecture evolves.

A good introduction to the software frameworks behind internet security can be found here: Security challenges for Enterprise Java in an e-business environment, L. Koved et al.

How to get 4 slides on one page using xsl-fo

Over the holidays I've been toying around with XSL-FO formatting objects. My goal was to figure out a way to create PDF from my slides - with four slides per page - using Norman Walsh's slide doctype. It was easy to create a two-column version: XSL-FO provides a column attribute, mapped to a column-count parameter in the slide parameter section. But I was desperately looking for an attribute that would tell the FOP processor to put two rows of two slides each on a page. Until I realized my mistake: there is no such parameter, because the way to get more slides on a page is simply to increase the usable space on the page by adjusting several margin parameters - in other words, to enlarge the body region of the page-master template.

The flow object processor will then simply fill the formatting objects with content until a page is full and then start the next page.

Finally I added page numbers to the slides using xsl:number instead of fo:page-number. I've described the way XSL-FO works in my latest lecture on document construction. An example of the new PDF style is the draft of the introduction to professional development environments.

The space in "cyberspace"

A gift from our CEO Andreas, "The Architecture of Intelligence" proved worthwhile reading over the X-mas holidays. Written by Derrick de Kerckhove, it talks about the relation between architecture and cyberspace - about the mingling of the virtual world and the real world. It presents a lot of very interesting online and architectural projects, even though I have to admit that I did not get everything on the first go. But it made me think about several things: do we need spatial metaphors in online "spaces"? Does an e-portal gain anything by using spatial metaphors? Perhaps I am too much of an old Unix person to appreciate the latest GUI and VR developments - and didn't Neal Stephenson just publish a book on how the modern GUI basically disables users, prevents them from acquiring real knowledge about computers and computing?

If the tuple (x, y, z, time) is not useful in cyberspace, what tuple is? The book emphasizes relations over and over, and topic maps (with underlying XLink mechanisms) immediately came to my mind.

I looked at some of the websites run by architects or the Smithsonian without walls project but the navigation of those sites sometimes made it hard to understand what the sites offered.

Architecture has had an enormous influence on software during the last 10 years: Christopher Alexander's design patterns are now common engineering knowledge. But it seems that computing now has a reverse impact on architecture: ubiquitous computing requires new spaces and new architectures.