Blog
-
5 years 6 months ago | ikalyvas
“What’s the saying? ‘Every time a Targaryen is born the gods flip a coin.’” - Cersei Lannister

With this phrase Cersei reminds us of the madness that runs in Targaryen blood and foreshadows what is to come. After the last two episodes of the last season of Game of Thrones, we know which side the coin landed on. The question is, could this madness have been predicted? Was Daenerys’ rise to her Mad Queen title foreshadowed, or was it an act of madness that came out of nowhere? Armed with the bias that hindsight offers, we will perform sentiment analysis on her script lines from all 8 seasons to find out whether Daenerys showed early signs of her erratic personality.

Photo by King Siberia on Unsplash

The tools
Along with the expectation of seeing whether the outcome is actually aligned with the storytelling, we are mostly thrilled to put some cool toys into action.
Our weapons of choice will be Apache Spark, ML.NET and MobiusCore, which is our .NET Core port of Mobius. Mobius is an open source library, created by Microsoft, that provides C# bindings to Apache Spark.

ML.NET
ML.NET is Microsoft’s open source framework for machine learning. It allows you to create your own custom ML models and use them to perform Sentiment Analysis, Product Recommendation, Image Classification and all kinds of cool stuff.
We will be using ML.NET to perform Sentiment Analysis on all of Daenerys’ script lines across all the episodes of the 8 seasons of Game of Thrones.

Apache Spark
Apache Spark is a general-purpose distributed analytics engine for processing large amounts of data. It’s probably the most popular open source Big Data library. We will be using it to run our sentiment analysis tasks in parallel over a cluster.

MobiusCore
We are heavily invested in C# and we always wanted C# support for running Spark jobs. As a result, when Microsoft created Mobius for the .NET platform, we jumped on it. At the same time, being huge .NET Core fans, we were hoping to get the same level of support there too. Certain implementation details did not allow the Mobius team to target the .NET Core platform. Thus the idea of MobiusCore was born.
MobiusCore is an open source port of Mobius for .NET Core. Mobius relied heavily on delegate serialization in order to allow user-defined functions written in C# to be executed by Spark Workers. As a result it was difficult for Mobius to target the .NET Core platform, because delegate serialization was dropped by the .NET Core team (see discussions here and here). In comes MobiusCore. To provide the support we wanted, we replaced all method reference delegates with lambdas expressed as LINQ Expressions. Although LINQ Expressions are not directly serializable, they represent an expression tree. We can extract the required information from the LINQ expression tree, pass it to the Mobius Workers, reconstruct the lambda from the expression tree on the Worker and execute it. Tap dancing around the delegate serialization minefield, we were able to keep the same fluent, expressive task definitions, bypassing the API differences while porting to .NET Core.
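To make the expression-tree round trip a bit more tangible, here is a minimal sketch of the same idea in Python rather than C#. It is only an analogy and not MobiusCore's actual code: the function names (`to_tree`, `from_tree`), the JSON wire format and the small set of supported operators are all assumptions made for illustration. The "driver" turns a lambda into a plain data tree, the tree travels as text, and the "worker" rebuilds a callable from it.

```python
import ast
import json
import operator

# Operators this toy format understands (an illustrative whitelist).
_OPS = {
    "Add": operator.add, "Sub": operator.sub, "Mult": operator.mul,
    "Gt": operator.gt, "Lt": operator.lt, "Eq": operator.eq,
}


def _encode(node):
    """Turn a Python AST node into a plain dict that JSON can carry."""
    if isinstance(node, ast.Constant):
        return {"kind": "const", "value": node.value}
    if isinstance(node, ast.Name):
        return {"kind": "param", "name": node.id}
    if isinstance(node, ast.BinOp) and type(node.op).__name__ in _OPS:
        return {"kind": "op", "op": type(node.op).__name__,
                "left": _encode(node.left), "right": _encode(node.right)}
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and type(node.ops[0]).__name__ in _OPS):
        return {"kind": "op", "op": type(node.ops[0]).__name__,
                "left": _encode(node.left),
                "right": _encode(node.comparators[0])}
    raise ValueError("unsupported expression: " + type(node).__name__)


def to_tree(source):
    """Driver side: parse the source of a lambda into a serializable tree."""
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.Lambda):
        raise ValueError("expected a lambda expression")
    return {"params": [a.arg for a in node.args.args],
            "body": _encode(node.body)}


def _evaluate(node, env):
    """Walk the tree and compute its value for the given parameter values."""
    if node["kind"] == "const":
        return node["value"]
    if node["kind"] == "param":
        return env[node["name"]]
    return _OPS[node["op"]](_evaluate(node["left"], env),
                            _evaluate(node["right"], env))


def from_tree(tree):
    """Worker side: rebuild a callable from the received tree."""
    def rebuilt(*args):
        return _evaluate(tree["body"], dict(zip(tree["params"], args)))
    return rebuilt


# The lambda itself never crosses the wire, only its tree does.
wire = json.dumps(to_tree("lambda score: score * 2 + 1 > 5"))
is_high = from_tree(json.loads(wire))
```

The point of the design is the same as in the text above: the function object is never serialized, only a description of it, so the receiving side needs nothing but the ability to interpret that description.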
If you are interested in getting some more details, you can find some here.

In the meantime Microsoft released its own new library for Apache Spark bindings in .NET Core, .NET for Apache Spark, which supports the Apache Spark DataFrame API, and we are super-excited about it. We can’t wait to see where the efforts of the Microsoft team will take us next. In fact, we are now looking into how we can assist in this great new tool!

For our example we will be using MobiusCore to implement the algorithm that will perform Daenerys’ psychological evaluation.

Psyche Eval
It’s time to sit Daenerys down on the examination couch. We have gathered the scripts of all 8 Game of Thrones seasons from the Genius API and trained our ML model with the AFINN Lexicon (more info here). Now, the good thing is, the text is all English. Translating Dothraki would probably be a showstopper! At this point, doing the Sentiment Analysis is as simple as:

It’s amazing what you can do when standing on the shoulders of giants.

Results
Dear Mother of Dragons! The results will definitely surprise you. Daenerys’ Sentiment Analysis came out, and it scores a staggering 74.89% of negative lines. Although the results contain some false positives (or should I say false negatives?), Daenerys certainly had her fair share of toxic moments!

The top picks contain:
Text: Daenerys Targaryen: He was no dragon. Fire cannot kill a dragon.
Toxicity Prediction: Toxic sentiment | Probability of being toxic: 0.983727037906647
Text: Daenerys Targaryen: Have you ever seen a dragon?
Toxicity Prediction: Toxic sentiment | Probability of being toxic: 0.970722794532776
Text: DAENERYS: I’m not a politician. I’m a queen.
Toxicity Prediction: Toxic sentiment | Probability of being toxic: 0.968557298183441
Text: DAENERYS: (speaks Valyrian) Unsullied! Slay the masters, slay the soldiers, slay every man who holds a whip, but harm no child.
Strike the chains off every slave you see!
Toxicity Prediction: Toxic sentiment | Probability of being toxic: 0.96771764755249
Text: DAENERYS: I know what my father was. What he did. I know the Mad King earned his name.
Toxicity Prediction: Toxic sentiment | Probability of being toxic: 0.966907918453217
Text: DAENERYS: Jorah sent my secrets to Varys. For 20 years the spider oversaw the campaign to find and kill me.
Toxicity Prediction: Toxic sentiment | Probability of being toxic: 0.965652883052826
Text: DAENERYS: You’re a strange man.
Toxicity Prediction: Toxic sentiment | Probability of being toxic: 0.960509717464447
Text: DAENERYS: My enemies are in the Red Keep. What kind of a queen am I if I’m not willing to risk my life to fight them?
Toxicity Prediction: Toxic sentiment | Probability of being toxic: 0.957431375980377
Text: DAENERYS: I do not recognize this tradition.
Toxicity Prediction: Toxic sentiment | Probability of being toxic: 0.955697119235992
Text: DAENERYS: I am sorry you no longer have a father, but my treatment of the masters was no crime. You’d be wise to remember that.
Toxicity Prediction: Toxic sentiment | Probability of being toxic: 0.947878360748291

We can already see, from the top 10 negative lines picked, some toxic behaviors, like the indifference she showed at her brother’s death, the execution order she gave to her Unsullied and her lack of remorse when addressing the son of one of the Meereen masters she had recently killed.

Now, a lot can be said about what a Dragon Queen’s expected level of toxicity ought to be, or what it would have taken for her not to act or speak the way she did. One might also argue that the warning to “harm no child” should lower the toxicity score, but she seemed to have forgotten about it as the show drew to its final episodes, so +12 points for the Queen of Ashes.

Conclusion
With the power of hindsight, the aid of both our new and established toys, and with some good will and humor, we have made it official!
It seems that Daenerys had Queen of the Ashes written all over her from the very beginning. We were all in favor of the Savior title bestowed on her, so we looked the other way whenever she showed glimpses of madness, and we refused to believe that she would become her father’s daughter. At least, that is what ML.NET, Apache Spark and MobiusCore have concluded.

What would it take to prevent the destruction of King’s Landing? The Westerosi would need a cluster running Apache Spark, some great open source libraries and a few lines of code… or just a little more trust in Varys’ gut.

PS: At CITE we have been actively engaged with Apache Spark for a long time and, in parallel, have been following the progress of .NET Core and using all the goodies it brings to .NET enthusiasts like ourselves. So, we worked on MobiusCore as a way to spread the love. What a brave new world to be coding in!
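
PPS: For readers curious how a figure like the share of negative lines comes about, here is a toy sketch in Python. It is not the ML.NET model used in this post: instead of a trained classifier it simply sums per-word scores from a lexicon, in the spirit of AFINN. The lexicon entries, their values and the sample lines used in the usage note are invented for illustration and are not taken from AFINN or from the scripts.

```python
import re

# Hypothetical mini-lexicon; the words and scores are illustrative only.
TOY_LEXICON = {
    "kill": -3, "slay": -3, "enemies": -2, "sorry": -1,
    "free": 1, "trust": 1, "love": 3,
}


def score_line(text, lexicon):
    """Sum the lexicon scores of the words in one line (unknown words count 0)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)


def negative_share(lines, lexicon):
    """Percentage of lines whose total score is below zero."""
    if not lines:
        return 0.0
    negative = sum(1 for line in lines if score_line(line, lexicon) < 0)
    return 100.0 * negative / len(lines)
```

Calling `negative_share(["I will kill my enemies", "I love my people"], TOY_LEXICON)` counts one negative line out of two. A trained model, like the one in this post, outputs a probability per line instead of a raw sum, but the final percentage is aggregated the same way.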
-
7 years 3 weeks ago | cite-admin
On 22 November 2017, at 12:00pm CET, BlueBRIDGE organised a webinar on "New Generation Tools for Aquaculture". The webinar focused on the BlueBRIDGE tools that aquaculture producers can use to estimate the performance of their production, exploiting state-of-the-art Machine Learning methods based on real historical production data. Over 50 participants from around Europe attended the webinar.

Webinar description
Aquaculture is the fastest growing animal food production sector in the world, with continuously and rapidly increasing global production. However, the environment in which aquaculture companies operate is highly competitive, with limited margin for profit. All aquaculture producers have to face specific challenges concerning the improvement of the performance of their companies in terms of cost, feed conversion, growth rate and mortality. At the same time, their decisions should be sustainable and environmentally friendly. Small mistakes can make the difference between profit and loss. Using the services provided by BlueBRIDGE, aquafarmers can estimate the performance of their production, exploiting state-of-the-art Machine Learning methods based on real historical production data.
Furthermore, they are able to make accurate production plans and future investment plans by exploiting the geoanalytics platform and the techno-economic analysis, which combine production, financial and environmental data. In this way, they can make correct and timely decisions that strengthen their company's position against the competition.

The webinar gave an overview of the BlueBRIDGE services supporting aquaculture.

Webinar Information
Duration: 1 hour
Start date: 22nd November 2017
Start time: 12:00pm CET
Timezone: Central European Time (CET)

About the Speakers
Gerasimos S. Antzoulatos is a Data Analyst at I2S. Gerasimos holds a Degree in Mathematics and an M.Sc. Degree in “Computer Mathematics and Decision Making” from the University of Patras. His research interests include Computational Intelligence methods and their application to Data Mining and Knowledge Discovery, Business Intelligence and Predictive Analytics. He is participating in the BlueBRIDGE project on behalf of I2S S.A. as a researcher, focusing on the development of Machine Learning prediction models for evaluating aquaculture performance. In addition, he supports aquafarmers in benchmarking and decision-making processes using BlueBRIDGE services.
Charalampos Dimitrakopoulos is an Information Technology Consultant at CITE. He has been actively involved in the BlueBRIDGE project as a researcher and reporter on the implementation of a techno-economic tool for aquaculture management. He is currently studying for a Master's Degree in Business Administration and Management at Athens University of Economics and Business. He holds a Bachelor's degree from the Department of Mathematics of the University of Patras and a Master of Science from the Department of Banking and Financial Management of the University of Piraeus.
Panagiota Koltsida is a computer science researcher and software engineer at the University of Athens (NKUA) and ATHENA Research Center. She received both her BSc (2006) and MSc (2009) from the Department of Informatics and Telecommunications of the University of Athens and has more than 10 years of experience. Her research interests include Information Retrieval systems, Web Information Systems, and Data and Metadata Management. She is participating in the BlueBRIDGE project on behalf of the University of Athens, focusing on the development of the geospatial multi-factor optimisation and alerting platform and the search framework.
-
8 years 9 months ago | cite-admin
xWCPS (XPath Enabled WCPS) is a Query Language (QL) defined in the context of the EarthServer 2 project. It aims to merge two widely adopted standards, XPath 2.0, for its capabilities in handling XML (a common metadata form), and WCPS, for its raster data processing abilities, into a new construct that enables simultaneous processing of both coverage metadata and OGC coverage content.
-
Archived
13 years 5 months ago | cite-admin
Virtualisation brings a whole new set of tools to the toolbox of a modern IT infrastructure. In this text we briefly present our perception of virtualisation and describe the techniques and technologies that CITE applies for virtualising IT infrastructures.

Virtualisation Pros and Cons
Pros
Flexibility: Infrastructure can be reshaped without new hardware.
Easier to manage on a per-machine basis, as virtualised hardware tends to be uniform.
Easier to migrate to new hardware, as most physical resources are virtualised and fully hidden from the guest machine, while the benefits of increased performance are directly visible in the guest.
Increased base OS compatibility, as older operating systems and applications can be hosted on modern and in principle incompatible hardware.
Increased failure resilience, as it is easier to back up and recover systems, move them to new hardware and duplicate them. Related features are offered directly by the virtualisation platform.
Note: Different virtualisation techniques might reduce some of the aforementioned advantages of virtualisation, for the benefit of performance.
Cons
Reduced performance: while the performance drop can be negligible for CPU and memory operations, it can become substantial for I/O (disk, network, USB), depending on the virtualisation techniques applied.
Reduced advanced hardware compatibility at the guest OS: as resources are virtualised they become mostly invisible to the guest. Solutions that overcome or soften the effects of this do exist, such as USB sharing, paravirtualised drivers, specialised host drivers, etc.

Virtualisation Concepts
Guest: The virtual machine that runs on a physical machine by means of a virtualisation technique.
Host: The physical machine where virtualisation is applied.
Paravirtualisation: A set of techniques that require the guest operating system to be directly or indirectly aware of the fact that it is a guest operating system.
Modified kernels and drivers are some of the techniques applied.
Full Virtualisation: A technique that requires no modifications to the guest system. It can be achieved with or without special hardware.
Hardware Assisted Virtualisation: Virtualisation that depends on extended capabilities of the hardware (like VT-x and VT-d).

Virtualisation Platforms
In what follows we present a brief summary of the hardware virtualization solutions that we have expertise in.

Xen
XEN is an open source virtual machine monitor on top of which multiple guest domains (virtual machines) are hosted. The XEN hypervisor is the lowest layer of a XEN server. Through this layer all virtual machines access the hardware, as the hypervisor controls the physical system resources. Dom0, the first and privileged domain, is installed together with the hypervisor and runs a suitably modified version of a UNIX-type operating system. After installing the hypervisor we have a XEN system on which we can create many unprivileged guest domains. Those unprivileged guest domains are called Domain-Us or DomUs. Dom0 provides the tools for creating resources and managing those DomUs. The XEN hypervisor supports two types of DomUs: paravirtualized and hardware virtual machines (HVMs). Regardless of the virtualization type, every guest domain is isolated from the others and none of the DomUs has direct access to the system’s physical hardware. To use the paravirtualization (or software virtualization) technique we must use a modified UNIX-like operating system as a DomU. In that case the DomU’s operating system is aware that it is running on top of the XEN hypervisor and is modified so that it can communicate directly with it. When paravirtualization is employed we do not need any special hardware-assisted virtualization technology (AMD-V, Intel VT). Such modified operating systems, to be used as paravirtualized DomUs, are available for several flavours of UNIX-like operating systems.
In the case of Hardware Virtual Machines we are allowed to use any unmodified operating system. In this case the guest operating system is not aware that it is running on a hypervisor, therefore we need a hardware-assisted virtualization technology (AMD-V, Intel VT) along with a BIOS capable of enabling the CPU’s virtualization capabilities. An HVM’s operating system can be any UNIX-like or Windows operating system.

Hyper-V
Hyper-V is a hypervisor-based virtualization solution offered by Microsoft Corporation. Hyper-V is integrated into Microsoft Windows Server 2008 (standard, enterprise, datacenter) and is also available as a standalone version of the Hyper-V role in Server 2008, called Microsoft® Hyper-V™ Server 2008. The architecture of Hyper-V is similar to that found in XEN. Here we also have a hypervisor at the lowest layer, paired with a privileged parent partition (the counterpart of XEN’s Dom0) that has direct access to system hardware. Hyper-V supports Hardware Virtual Machines as unprivileged guests and requires hardware-assisted virtualization technology (AMD-V, Intel VT). HVMs run isolated, unaware of Hyper-V’s existence. They also have no direct access to the physical system hardware. When using Microsoft Windows Server 2008 with the Hyper-V role enabled, virtual machines are managed through Windows Server 2008; when using Microsoft® Hyper-V™ Server 2008, they are managed through the shell or remotely. Operating systems available for HVMs include Windows and UNIX variations.

VMware
VMware offers a range of virtualization products, some of which run as desktop applications and some standalone. VMware vSphere (commercial) and VMware vSphere Hypervisor (free) are enterprise-class virtualization solutions. VMware vSphere Hypervisor is based on VMware ESXi. Unlike XEN and Hyper-V, it uses hardware vendors’ drivers and a POSIX-like kernel developed by VMware, called VMkernel, which fully manages the virtual server. Virtual machines run on top of the VMkernel.
Supported virtualized operating systems are Windows, Unix-like, Netware and more.

KVM (Kernel-based Virtual Machine)
KVM (Kernel-based Virtual Machine) is an open source virtualization solution for Linux on x86 hardware. It requires hardware-assisted virtualization technology (AMD-V, Intel VT) and a BIOS capable of enabling CPU virtualization. Instead of having a "bare metal" hypervisor, KVM uses the Linux kernel as the hypervisor through a loadable kernel module, kvm.ko, and a processor-specific module, kvm-intel.ko or kvm-amd.ko, depending on the system’s CPU. It also requires a modified version of QEMU to virtualize hardware resources. In KVM a virtual machine is implemented as a Linux process. Supported guest operating systems include Windows, Unix-like, Netware and more.

Disclaimer: The article expresses the personal opinions of experts at CITE. As such it cannot be considered a documented comparison or analysis of the aforementioned hypervisors and their characteristics.
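
Appendix: Several of the platforms above (XEN HVMs, Hyper-V, KVM) depend on hardware-assisted virtualization. On a Linux x86 host, CPU support is commonly checked by looking for the `vmx` (Intel VT-x) or `svm` (AMD-V) flag in `/proc/cpuinfo`. The following is a minimal Python sketch of that check; the function name is our own choice for illustration, and it takes the file's text as input so it can be tried on any machine.

```python
def hardware_virtualization(cpuinfo_text):
    """Return 'Intel VT-x', 'AMD-V' or None, based on the CPU flags listed
    in text formatted like Linux's /proc/cpuinfo."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            if "vmx" in flags:
                return "Intel VT-x"
            if "svm" in flags:
                return "AMD-V"
    return None
```

On a Linux host one could call it as `hardware_virtualization(open("/proc/cpuinfo").read())`. Note that the flag reflects what the CPU reports; the feature may still need to be enabled in the BIOS, as mentioned in the platform descriptions above.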