We had a security day this term, and I gave a talk on robustness and what is wrong with our industry. My claim is that we are abusing IT security to compensate for basic problems in software and languages. This is a big waste of time and money and doesn't help at all.
It was nice to see the next three speakers basically confirm all my statements by trying to improve an already hopeless situation. The first talk was "Vulnerability Scanning durch Graph Traversierung" (vulnerability scanning through graph traversal) by Paul Duplys, Bosch-Forschungszentrum Renningen.
The goal was to detect input validation problems in C code using a model checking approach. Technically very interesting, the talk also showed the huge effort necessary to even partially overcome the problems of a defective language.
The next talk, on DDoS attacks with Mirai botnets, showed again how hopeless C code in embedded control environments really is. With the added twist that the victims are neither the seller nor the buyer but totally unrelated third parties.
The last talk, by Achim Krauss of Cybereason, was the most interesting to me. He showed the use of machine learning to detect attacks (even very targeted ones) in large-scale corporate networks. But - and this is a big but - this costs your privacy. The agent on all machines simply collects just about everything, and the explanation was interesting: clever malware waits until the laptop is in a 192.168.x.x home network before it delivers its payload. To detect this, data must be collected even when the laptop is used at home.
This adds another dimension to my talk: It is no longer only about costs and time. It is about your privacy now as well. Instead of forcing the software and hardware companies to use safer technology, politics and corporations are now punishing the users for the defects. But there is no improvement visible on the horizon: The big corporations have their lobbyists in place (within the state too) to prevent any changes. There is simply too much money in bad safety nowadays...
In recent years the term "helicopter parents" has become a synonym for overprotective parents. Sadly, the academic world seems to follow this trend as well. Time to say a few words on what the results of an academic education should be and how to achieve this.
To me it is an absolute requirement that an academic education leads to independent and responsible behavior with respect to one's own fate. This includes the organization of one's courses. Here comes the first catch: this can be achieved only if there are options to choose from. A curriculum that does not allow choices does not expect students to define interests, watch trends and make their own plans.
Sometimes students complain about "lack of systematics" in my courses. Yes, I do jump between technical and non-technical topics sometimes. But I guess what they really mean is that I do not provide summaries at the end of a presentation. That sometimes there is no presentation at all. I believe that creating results and summaries is the job of those who listen to presentations and talks. It is an important mental process and it needs to be done by the participants.
On self-organisation. Google Docs allows us to share notes easily. Is it too much to ask students to maintain a list of presentation topics and dates online in a Google document? I don't think so. But for some this comes as a surprise.
Especially in the master program, questions are far more important than answers. This irritates students in the beginning. The best written summary of some technical topic will get you only a "2". Because I expect you to think ahead, question current knowledge and find interesting new questions.
I would like to finish with a quote from Nassim Nicholas Taleb, Antifragile: Things That Gain from Disorder: “The biologist and intellectual E. O. Wilson was once asked what represented the most hindrance to the development of children; his answer was the soccer mom. He did not use the notion of the Procrustean bed, but he outlined it perfectly. His argument is that they repress children's natural biophilia, their love of living things. But the problem is more general; soccer moms try to eliminate the trial and error, the antifragility, from children's lives, move them away from the ecological and transform them into nerds working on preexisting (soccer-mom-compatible) maps of reality. Good students, but nerds--that is, they are like computers except slower. Further, they are now totally untrained to handle ambiguity. As a child of civil war, I disbelieve in structured learning . . . . Provided we have the right type of rigor, we need randomness, mess, adventures, uncertainty, self-discovery, near-traumatic episodes, all those things that make life worth living, compared to the structured, fake, and ineffective life of an empty-suit CEO with a preset schedule and an alarm clock.”
This is a clear statement against too much "process" and "competence" in academic education!
At the beginning of the term we discussed a research project on fake news, its production and especially its reception across media. What we learned from this project was that we lack the software tools to get the right data, and that it is far from easy to get the necessary data in an automated way, e.g. using bot software. The workshop should allow researchers of social media and social graphs to voice their needs. We are going to demonstrate what we have learned up to this point by building and using a social bot.
|What are bots? What are social bots?|
|Is automated collection of data with bots legal?|
|Where and how are bots used?|
|What are the risks of using bots?|
|How to (not) get banned by social sites?|
|Presentation of our bot (Twitter, Instagram)|
|How do I recognize bots?|
|Results, Q&A and further research ideas|
|Guide for participants|
Wednesday, 31 May 2017, Stuttgart Media University, 14.15 - 15.45. Room: tbd. I will post the room as soon as I have it confirmed.
I got a nice example of "déformation professionnelle" this week while reading the weekly paper for our journal club at the university. The paper was "Scalability! But at what COST?" by Frank McSherry et al., in which the authors establish a single-core baseline for judging the efficiency of scalable systems. What? Isn't the sheer achievement of scalability enough in and by itself, one could ask? Especially somebody who has spent the last eight-plus years investigating the numerous scalability techniques used by Google, Facebook, Amazon and Co. Well, not if a multi-node implementation of a graph processing package shows nice scalability across many nodes but is trumped by a single core of a notebook! Even worse, some processing packages NEVER reached the single-core results at all! In the words of the authors, they had unbounded COST, with COST being defined as the number of cores/nodes needed to outperform the single-core implementation.
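To make the COST definition concrete, here is a minimal sketch in Python. The function name `cost` and all runtimes are my own invented illustration, not figures from the paper; the only thing taken from the text is the idea of comparing every multi-core configuration against a single-core baseline.

```python
from typing import Dict, Optional


def cost(single_thread_seconds: float,
         runtime_by_cores: Dict[int, float]) -> Optional[int]:
    """Return the smallest core count whose runtime beats the single-thread
    baseline, or None if no measured configuration does (unbounded COST)."""
    winners = [cores for cores, seconds in runtime_by_cores.items()
               if seconds < single_thread_seconds]
    return min(winners) if winners else None


# Hypothetical measurements in seconds, invented for illustration only.
baseline = 300.0
system_a = {16: 520.0, 32: 310.0, 64: 180.0, 128: 110.0}
system_b = {16: 900.0, 32: 700.0, 64: 610.0, 128: 590.0}

print(cost(baseline, system_a))  # 64
print(cost(baseline, system_b))  # None
```

Both hypothetical systems "scale" in the sense that more cores make them faster, yet the second one never catches up with the laptop.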
There are some important lessons behind those results: scalability vs. efficiency, distribution vs. centralization, network vs. local storage. Let's begin with scalability vs. efficiency. I had simply forgotten that scalability alone only means that we get more throughput when we add more processing nodes. This is what drives the request-heavy sites of Google and others. It says nothing about how much computing power we really need to do a certain form of processing. Let's take a look at the algorithms used for distributed graph processing. In many cases they follow what Google called "vertex centric thinking": the algorithm is written from the perspective of a vertex with its incoming and outgoing edges. Incoming data are processed (e.g. compared to the current state of the vertex) and an outgoing data value is transferred along the outgoing edges. If an edge crosses a machine border, a remote call is required.
The main architectural feature here is round-based processing, where changes are communicated until no further changes can be detected. Besides the obvious termination problem, this architecture requires lots of inter-node communication, and research has tried to find graph partitionings that minimize this overhead. The big advantage clearly lies in the size of the graphs which can be processed.
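The round-based, vertex-centric style can be sketched on a single machine. The example below is my own simplified illustration, not code from any of the systems in the paper: it finds connected components by letting every vertex adopt the smallest label it hears about, and it counts how many messages would cross a machine border under a made-up partitioning (`machine_of` is hypothetical).

```python
from typing import Dict, List, Tuple


def connected_components(adjacency: Dict[int, List[int]],
                         machine_of: Dict[int, int]
                         ) -> Tuple[Dict[int, int], int, int]:
    """Round-based label propagation.
    Returns (labels, rounds, remote_messages)."""
    label = {v: v for v in adjacency}      # each vertex starts as its own component
    active = set(adjacency)                # vertices whose label changed last round
    rounds = remote = 0
    while active:                          # stop when a round changes nothing
        rounds += 1
        inbox: Dict[int, List[int]] = {}
        for v in active:                   # send the current label along outgoing edges
            for w in adjacency[v]:
                inbox.setdefault(w, []).append(label[v])
                if machine_of[v] != machine_of[w]:
                    remote += 1            # this edge crosses a machine border
        active = set()
        for w, received in inbox.items():  # compute from the vertex's point of view
            best = min(received)
            if best < label[w]:
                label[w] = best
                active.add(w)
    return label, rounds, remote


# A chain 0-1-2-3 and a separate pair 4-5, split across two hypothetical machines.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
placement = {0: 0, 1: 0, 2: 1, 3: 1, 4: 0, 5: 1}

labels, rounds, remote = connected_components(graph, placement)
print(labels)  # {0: 0, 1: 0, 2: 0, 3: 0, 4: 4, 5: 4}
print(rounds, remote)
```

Even on this toy graph a good share of the messages are remote, which is exactly the overhead the partitioning research tries to reduce.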
What if you are Google and own thousands of smaller machines under the control of the Borg cluster manager (the predecessor of Kubernetes)? Then you can easily partition jobs across those machines and run clever algorithms without any additional work. Distributed graph processing looks absolutely natural in this case. The alternative would be to buy some huge machines which scale vertically, like "The Machine" from HP.
So where is the real trade-off between network and local processing? Latency has always been a problem in networks and disks. But the new high-speed SSDs offer enormous bandwidth and short access times together with vast amounts of storage. And not to forget: you can now put 1 terabyte of RAM in a machine at a reasonable price. Let's just do some calculations using the Facebook social graph. Given 2 billion users, we can estimate its size at roughly 4.5 terabytes. While this is not small, it fits easily on a few SSDs, and almost a quarter of it will fit into RAM. This should allow for efficient graph processing on one box only, without going distributed. I guess I now understand why Amazon offers instances with up to 1.5 terabytes of RAM and Microsoft does something similar to achieve vast amounts of memory. Perhaps HP is right with "memory centric computing" being the future. For large-scale analytics this seems to be the case already.
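One way to arrive at a number of this magnitude is a plain adjacency-list estimate. The sketch below is a back-of-envelope calculation only: the user count and the 1 terabyte of RAM come from the text, while the average number of friends and the bytes per entry are my own assumptions, picked so that the result lands near the quoted 4.5 terabytes.

```python
USERS = 2_000_000_000            # from the text
AVG_FRIENDS = 280                # assumption, not from the text
BYTES_PER_ENTRY = 8              # assumption: one 64-bit neighbour id per list entry
RAM_BYTES = 1_000_000_000_000    # 1 terabyte (decimal), as mentioned in the text


def adjacency_list_bytes(users: int, avg_friends: int, bytes_per_entry: int) -> int:
    """Size of a bare adjacency list: one entry per (user, friend) pair."""
    return users * avg_friends * bytes_per_entry


size = adjacency_list_bytes(USERS, AVG_FRIENDS, BYTES_PER_ENTRY)
print(f"{size / 1e12:.2f} TB")                # 4.48 TB
print(f"{RAM_BYTES / size:.0%} fits in RAM")  # 22% fits in RAM
```

Changing the assumed degree or using 4-byte ids moves the result by a factor of two or so, but not out of single-machine territory.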
Where is Stuttgart located? Closer to Detroit or closer to Silicon Valley, or even beyond? Traditional industries like automotive are facing disruption on a major scale. Stuttgart and the whole country both depend deeply on those traditional industries, which are all going digital now. What if these industries prove unable to adapt to those changes? Well, let's not think about this here, because the consequences would be disastrous.
But simply copying Silicon Valley technologies won't fit the bill: The industries need to go beyond that and come up with true inventions - possibly related to current abilities. But it will require major changes to the corporate culture. Silicon Valley companies invest heavily in conferences and talks simply to support the necessary networking required to get new ideas into production.
"Beyond Silicon Valley" is a series of talks centered around innovative companies which try to compete successfully with the digital giants of the valley. The talks will mostly be on a Wednesday or Thursday evening at Hdm. Unfortunately, no streaming and no recording until a permanent streaming infrastructure is installed. In the first talk in our series, M.Sc. Julian Weiß - IoT Ecosystem - Robert Bosch GmbH, and M.Sc. Dennis Grewe - Communication and Network Technologies - Robert Bosch GmbH, gave a talk on the Bosch cloud and IoT efforts: "BIC @HdM - Introduction to the Bosch IoT Cloud (BIC)" I will just comment on their agenda.
"Who is talking to you? " Both speakers are alumnis of the Computer Science and Media Master, HdM and they work at the new Bosch research facility in Renningen or at the cloud center in Feuerbach. " Why is Bosch building an IoT cloud?" According to the speakers, the Bosch strategy concerning the future is to connect each and every device, sensor or machine to the Internet. This requires a cloud data center to allow easy and fast deployment of new services to support those. They expect much shorter delivery and deployment times. " Can you provide an example of a Bosch cloud service?" They gave lots of examples actually, ranging from agriculture over connected cars to smart home, energy and many others. Companies want to trace their equipment at construction zones and protect against misuse and theft. Farmers need to watch out for freezing temparatures to protect their crop. " Car2X: Flexible IoT Data Acquisition from vehicles." A nice example that shows how many differnt connection technologies are being discussed. The cloud support could e.g. provide intelligent road-sign detection to cars without advanced sensors. Availability is a core feature in such an environment and needs to be supported by clever fallback modes in the disconnected case. " What are the BIC building blocks?" The Bosch cloud is actually a commercial solution from Pivotal. it offers the usual layers (IaaS, PaaS and SaaS) with software support from Bosch Innovations regarding security and other topics. There was some discussion about using a commercial solutions vs. building your own. But given the timescale, building was probably not a real option. " What are the next things?" Opening up the cloud for public use and bringing it into other countries (2018) are near term goals. Up to now, the Bosch cloud is only being used by Bosch departments. 
The speakers mentioned some problems with getting the different departments aligned behind the cloud and cloud solutions, especially regarding international service offers. "Sensors, Services and Software" ist the Bosch strategy even in the cloud and both speakers never tired of mentioning the many exciting things currently happening at Bosch. Engineers just take sensors outside and try something. There is without doubt a creative and inventive atmosphere at Bosch. Me, I am simply happy that there is now a large scale cloud installation in Germany, even if I would like to see it as a more general cloud. Bosch has its focus tightly on sensors and IoT and does not see itself as a competitor to Google or Amazon. All in all a very promising start for my little series on innovative companies in Stuttgart!
In case you thought that computer science causes the biggest social disruptions: think again. Think about buying yourself extra-clever children who come with a university entrance qualification right at delivery. In case of failure: remember the no-questions-asked return policy for good clients at major online stores. Sounds far-fetched? Better read some articles on transhumanism like Homo Deus: Gehirn-Upload, Unsterblichkeit, Künstliche Intelligenz by Stephan Schleim or learn how to easily change DNA with CRISPR. Here is a critical view.
Markus Turber, Roman Grasy and Tobias Schmuecker talked about transhumanism and 3D-printing of organs in my "aktuelle Themen" (current topics) seminar. The room was packed and it was obvious that the topic hit a nerve. The talks were a mixture of exciting new technology and the innovative context required to achieve results. Turber started with a few words on Intuity and the partnership with researchers at the Charité, Berlin. The task was to build a bio-printer for organic tissue. The company had never done anything like this. There was not a single biologist in the team. Instead, web application and UI specialists dominated. But after 120 days of experiments a working prototype of a 3D printer had been created. It is amazing to see the printhead move over several pots with organic material, dip into the pots and assemble small organic components by building layer after layer. The vascular system seems to be the key to building "living" components, and the images showed arteries filled with blood inside the printed organic components.
On the invention side: we seem to have almost all necessary ingredients to build such a machine. They can be ordered from chemical or medical supply companies. But it takes a team of creative people to start assembling them.
There are endless opportunities hidden behind such a printer: the organic components can be used for pharmaceutical tests without live animals. They can be used to find the proper antibiotic or antiviral drug in one go instead of going through long rows of sequential tests. They can be used to build more complex organs, even though this is not a reality yet. One very important lesson learned: the result of a bio-printer is merely a lattice for biology to build on. In other words: the printer can only work in the way life and biology work too.
That last point is clearly left behind when they talked about CRISPR: the DNA technology that allows single DNA elements to be replaced by any other element. Together with a tightly controlled genetic line of a special fish, researchers can compare individual changes, detect the genes causing the differences and cut out the DNA sequences. The next steps would be to transfer those genes to other living creatures and by doing so e.g. make humans re-grow new teeth or create better eyesight. This is where we are entering transhumanism with all its social, ethical and economic problems. We didn't discuss CRISPR in depth and suggested another talk on that subject later on.
I am not sure whether genetic manipulation and breeding are really that different. I have learned that breeding certain features requires exposing plants to all kinds of abuse to force mutations which hopefully turn out OK. But just cutting and pasting genes between living systems seems to be different: those changes are on an individual scale, while I have learned that mutations frequently cause changes in several different places - like a system changing in major parts to achieve a new form of equilibrium. Well, surely something to ponder over in the next months.
Thinking about buying a little e-car for yourself? Or maybe something a bit bigger, like a Tesla? Well, if only there weren't so many open questions like charging, repairs, prices, range problems etc. The state of Norway seems to be unperturbed and just announced that they would prohibit the sale of gas/diesel cars after 2025. The first E-Mobility Day at HdM is going to tackle those questions with the support of Fraunhofer IAO and Vector Informatik.
We will dive into complicated charging issues, get an overview of current developments in e-mobility and have a discussion with the audience. And we will make a trip to our neighbour, Fraunhofer IAO, to take a look at their existing infrastructure. And if you want to know, whether e-mobility makes sense for a company even economically, you can talk to Tim Kost from BridgingIT. He comes with his Tesla - one of 20 that the company has and uses throughout Germany for its consultants.
And that is not all on this day: for the first time we will test a smartphone-based solution for live-streaming of events. Hopefully you will be able to use the URL for live participation in the event. But your browser needs to support WebRTC for this!
Agenda
13.45 Welcome, Prof. Walter Kriha
13.50 "EcoGuru: E-Mobility at Fraunhofer IAO", Kristian Lehmann M.Sc., Fraunhofer IAO Stuttgart (http://ecoguru.de)
14.15 Start of the IAO tour. The tour can only take place in smaller groups. In the meantime, a Tesla can be viewed in front of HdM. Tim Kost, Business Unit Manager & Senior Consultant at BridgingIT, will be there to answer questions. His company has 20 Teslas in use.
15.45 Smart charging of electric vehicles, Patrick Sommer, Vector Informatik
16.45 Discussion with the audience on electric mobility
17.15 End of the event
On 15th January we are going to have a workshop on concurrency and parallelism, together with the industry. Different programming language and dataflow concepts will be presented and discussed along a real case. A final discussion of the different approaches will hopefully increase our understanding of this complicated topic. We will also discuss patterns of applicability with respect to the different technologies.
Some of the ideas for the workshop follow the book "Seven Concurrency Models in Seven Weeks" by Paul Butcher, and some were suggestions from the course.
Michel Zedler from Exxeta will be presenting a realistic case for parallelism. We will discuss the different technologies in the context of this case.
Clojure and Akka are well known for their agent- and actor-based architectures, and we should take a look at both. Vanessa Werner is currently working on Akka at Exxeta as part of her master's thesis.
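For those who have never seen the idea in code, here is a minimal sketch of an actor in plain Python. It is not Akka's or Clojure's API; `CounterActor` and its methods are hypothetical names for illustration. The point is only that the state is private and a single thread works through a mailbox one message at a time, so the state itself needs no lock.

```python
import queue
import threading


class CounterActor:
    """A tiny actor: private state, a mailbox, and one thread that handles
    messages strictly one after another."""

    _STOP = object()  # sentinel message that shuts the actor down

    def __init__(self) -> None:
        self._mailbox: queue.Queue = queue.Queue()
        self._count = 0
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, amount: int) -> None:
        """Asynchronous send: just drop the message into the mailbox."""
        self._mailbox.put(amount)

    def stop(self) -> int:
        """Ask the actor to finish its mailbox, then return the final count."""
        self._mailbox.put(self._STOP)
        self._thread.join()
        return self._count

    def _run(self) -> None:
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                return
            self._count += message  # only this thread ever touches _count


def send_many(actor: CounterActor, times: int) -> None:
    for _ in range(times):
        actor.send(1)


def demo() -> int:
    actor = CounterActor()
    senders = [threading.Thread(target=send_many, args=(actor, 1000))
               for _ in range(4)]
    for t in senders:
        t.start()
    for t in senders:
        t.join()
    return actor.stop()


print(demo())  # 4000
```

Real actor frameworks add supervision, addressing and distribution on top of this, which is where the interesting discussion for the workshop starts.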
Hidden behind the rather arcane topic of "Communicating Sequential Processes" (CSP), message passing (MP) has been a core architectural feature of Erlang for many years. And not just in the telco industry. Large-scale sites use many Erlang programs too. The Go language made the concept even more popular. The Butcher book has a chapter on CSP with Clojure and we could use it as an intro to the concept.
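As a rough illustration of the CSP style, here is a small pipeline in Python (3.8 or later). It is an approximation under my own assumptions: Python's `queue.Queue` with `maxsize=1` stands in for a channel, although it is a bounded buffer and not a true rendezvous channel as in Go or core.async. In contrast to actors, the processes are anonymous and the channels are the named, first-class things.

```python
import queue
import threading
from typing import List

DONE = object()  # sentinel that signals "channel closed"


def produce(out: queue.Queue, n: int) -> None:
    """First process: writes 0..n-1 to its output channel."""
    for i in range(n):
        out.put(i)           # blocks while the channel is full
    out.put(DONE)


def square(inp: queue.Queue, out: queue.Queue) -> None:
    """Second process: reads numbers, writes their squares."""
    while (item := inp.get()) is not DONE:
        out.put(item * item)
    out.put(DONE)


def run_pipeline(n: int) -> List[int]:
    numbers: queue.Queue = queue.Queue(maxsize=1)
    squares: queue.Queue = queue.Queue(maxsize=1)
    threading.Thread(target=produce, args=(numbers, n)).start()
    threading.Thread(target=square, args=(numbers, squares)).start()
    results = []
    while (item := squares.get()) is not DONE:   # the main thread is the last stage
        results.append(item)
    return results


print(run_pipeline(5))  # [0, 1, 4, 9, 16]
```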
State and updates of state are usually the most critical parts of concurrent systems. But it is rather hard to imagine a system where you can have variables without the state problem. Clojure seems to separate identity from state using STM mechanisms. A good reason to look at how this works by reading the chapter in Paul Butcher's book. There is also a chapter on STM in the Herlihy book if you need more background on the algorithms behind it. BTW: what level of isolation is provided by the algorithm in the Herlihy book? Is it sequential consistency? Linearizability? Some form of snapshot isolation?
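The separation of identity and state can be sketched with a single reference that is updated optimistically. This is a deliberately reduced illustration and not Clojure's STM: it covers one reference only (closer in spirit to a Clojure atom than to coordinated refs in a transaction), and the class `Ref` with its `swap` method is a hypothetical name of mine. The identity is the `Ref` object; the state is whatever immutable value it currently points at.

```python
import threading
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class Ref(Generic[T]):
    """An identity that points at a succession of immutable values."""

    def __init__(self, value: T) -> None:
        self._value = value
        self._version = 0
        self._lock = threading.Lock()   # guards only the short read and commit steps
        self.retries = 0                # how often an update had to be redone

    def deref(self) -> T:
        return self._value

    def swap(self, update: Callable[[T], T]) -> T:
        """Apply a side-effect-free function to the current value and commit
        the result only if nobody else committed in the meantime."""
        while True:
            with self._lock:
                seen_version, old = self._version, self._value
            new = update(old)           # computed outside the lock
            with self._lock:
                if self._version == seen_version:
                    self._value, self._version = new, seen_version + 1
                    return new
                self.retries += 1       # conflict: retry against the fresh value


def bump(ref: Ref, times: int) -> None:
    for _ in range(times):
        ref.swap(lambda n: n + 1)


counter: Ref = Ref(0)
workers = [threading.Thread(target=bump, args=(counter, 1000)) for _ in range(4)]
for t in workers:
    t.start()
for t in workers:
    t.join()

print(counter.deref())  # 4000
```

A full STM has to do the same validate-and-retry dance across several references at once, which is where the isolation question above becomes interesting.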
This has not been covered in our course, but it gives us a good chance to better understand the differences between thread-level parallelism (our topic) and what can be done with different kinds of problems, e.g. massively parallel algorithms. The goal would be to clearly distinguish use-cases for both technologies.
Just about the hottest topic in Big Data and large-scale sites: real-time processing of data in pipelines. There is an interesting architectural pattern called "lambda architecture" (not to be confused with Amazon's new Lambda service, which is equally exciting) which allows both batch and real-time processing of data. It is worth taking a look at this to see what we can do with this kind of system. Tyler Akidau has released some articles on the topic recently: "The world beyond batch: Streaming 101", a high-level tour of modern data-processing concepts, and "The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing".
Friday, 15 January 2016, 14.00 - 18.00 at HdM, room 056. I am not sure about a live stream due to the workshop character of this event. If it is available, it can be found here: HdM homepage.
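The core idea of the lambda architecture fits into a few lines. The sketch below is a toy under my own assumptions (the class `LambdaCounts`, page-view counting as the use case, and a batch run that finishes instantly are all hypothetical simplifications): an append-only master dataset, a batch view recomputed from scratch, a speed view that covers only what arrived since the last batch run, and queries that merge both.

```python
from collections import Counter
from typing import List, Tuple

Event = Tuple[int, str]  # (timestamp, page)


class LambdaCounts:
    def __init__(self) -> None:
        self.master: List[Event] = []        # append-only master dataset
        self.batch_view: Counter = Counter() # complete but stale between batch runs
        self.speed_view: Counter = Counter() # incremental, covers recent events only

    def ingest(self, event: Event) -> None:
        """Every event goes to the master dataset and to the speed layer."""
        self.master.append(event)
        self.speed_view[event[1]] += 1

    def run_batch(self) -> None:
        """Batch layer: recompute the view from all data, then reset the speed
        layer. A real batch run takes long and overlaps with new events."""
        self.batch_view = Counter(page for _, page in self.master)
        self.speed_view = Counter()

    def query(self, page: str) -> int:
        """Serving layer: merge the batch view with the real-time view."""
        return self.batch_view[page] + self.speed_view[page]


store = LambdaCounts()
for ts, page in [(1, "home"), (2, "blog"), (3, "home")]:
    store.ingest((ts, page))
print(store.query("home"))  # 2, served from the speed view only
store.run_batch()
store.ingest((4, "home"))
print(store.query("home"))  # 3, two from the batch view plus one recent event
```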