Program synthesis can automatically produce code. Its applications include web automation, hardware security, operating system extensions, programming for non-programmers, authoring of SQL queries, configuration management, automatic code translation, and superoptimization.
Come learn how program synthesis can supercharge your workflow and how it complements machine learning. Attend talks by the world's experts in program synthesis and participate in hands-on tutorials. Learn about the technology, its capabilities and limitations, and the current state of the art in synthesis tools.
How much intelligence does a computer need to reach parity with human programmers? Not that much, it turns out --- it only needs to write a perfect program with up to 50 operations. Automatic program synthesis has recently reached this level of performance on diverse programming tasks, including synthesis of SQL queries, end-user web scripts, and crash consistency models. I will sample these results and describe the methodology and algorithms behind recent synthesizers. I will end with a discussion of opportunities and challenges in deploying synthesis in the construction of bigger systems, including complementing machine learning by synthesizing the hard, sound, inapproximable components of Software 2.0.
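The flavor of example-driven synthesis described above can be sketched in a few lines: enumerate programs in a small domain-specific language, shortest first, and return the first one consistent with every input-output example. The DSL and names below are invented for illustration; they are not taken from any tool presented at the workshop.

```python
import itertools

# Tiny illustrative DSL of string transformations.
OPS = {
    "upper": str.upper,
    "strip": str.strip,
    "first_word": lambda s: s.split()[0] if s.split() else s,
}

def synthesize(examples, max_len=3):
    """Return the first op sequence consistent with all I/O examples."""
    for length in range(1, max_len + 1):
        for seq in itertools.product(OPS, repeat=length):
            def run(s, seq=seq):
                for op in seq:
                    s = OPS[op](s)
                return s
            if all(run(inp) == out for inp, out in examples):
                return list(seq)
    return None  # no program of length <= max_len fits

examples = [("  ada lovelace ", "ADA"), (" grace hopper", "GRACE")]
print(synthesize(examples))  # -> ['upper', 'first_word']
```

Real synthesizers scale this brute-force core with pruning, deduplication by observed behavior, and solver-backed search, but the specification-by-example interface is the same.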
We have promised social scientists a data revolution, but it hasn’t arrived. What stands between practitioners and the data-driven insights they want? Acquiring the data. In particular, acquiring the social media, online forum, and other web data that was supposed to help them produce big, rich, ecologically valid datasets. Web automation programming is resistant to high-level abstractions, so end-user programmers end up stymied by the need to reverse engineer website internals—DOM, JavaScript, AJAX. Programming by Demonstration (PBD) offered one promising avenue towards democratizing web automation. Unfortunately, as the web matured, the programs became too complex for PBD tools to synthesize, and web PBD progress stalled. I’ll describe how I reformulated traditional web PBD around the insight that demonstrations are not always the easiest way for non-programmers to communicate their intent. By shifting from a purely Programming-By-Demonstration view to a Programming-By-X view that accepts a variety of user-friendly inputs, we can dramatically broaden the class of programs that come in reach for end-user programmers. My Helena ecosystem combines (i) usable PBD-based program drafting tools, (ii) learnable programming languages, and (iii) novel programming environment interactions. The end result: non-coders write Helena programs in 10 minutes that can handle the complexity of modern webpages, while coders attempt the same task and time out in an hour.
Advances in domain-specific languages (DSLs) have enabled developers to build efficient applications across many different disciplines (e.g., Spark for data analytics, CUDA for GPUs). To leverage such DSLs, however, programmers need to learn new abstractions and rewrite their existing applications to take advantage of the optimizations offered by such languages. For developers, this is a tedious, error-prone job; for DSL designers, it hinders uptake of their language. In this talk, I will first describe our work on using verified lifting to build compilers that translate general-purpose programming languages into DSLs. For each piece of input code to be translated, verified lifting uses program synthesis to automatically find both a program in the target language and a proof of semantic equivalence between the input and the synthesized output programs. As we applied verified lifting to a number of different domains, however, we found that building such compilers still takes a nontrivial amount of effort, as users need to implement an abstract syntax tree to represent the target language and build a code synthesizer for their domain. To tackle these challenges, if time permits I will discuss MetaLift, a new framework we are building that enables users to easily construct compilers using verified lifting.
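As a rough illustration of the lifting step: search a space of target-language candidates for one that behaves like the source code. Everything below is invented for illustration, and exhaustive testing on a finite input set stands in for the semantic-equivalence proof that real verified-lifting systems discharge with a solver.

```python
def source_fn(xs):
    # "Source program": a hand-written loop summing squares.
    total = 0
    for x in xs:
        total += x * x
    return total

# Invented "DSL" candidates: high-level skeletons over the input list.
CANDIDATES = {
    "sum(xs)": lambda xs: sum(xs),
    "sum(x*x for x in xs)": lambda xs: sum(x * x for x in xs),
    "max(xs, default=0)": lambda xs: max(xs, default=0),
}

def lift(fn, test_inputs):
    """Return the name of a candidate that agrees with fn on every test input."""
    for name, cand in CANDIDATES.items():
        if all(cand(xs) == fn(xs) for xs in test_inputs):
            return name
    return None

inputs = [[], [1], [2, 3], [-1, 4, 5]]
print(lift(source_fn, inputs))  # -> sum(x*x for x in xs)
```

The payoff is the same as in the talk's setting: once the loop is recognized as a high-level operator, a DSL backend (not shown) can compile it to an optimized implementation.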
In this talk I'll describe a new automatic scheduler for imaging and learning pipelines written in Halide. It learns to schedule by generating random plausible Halide algorithms, randomly scheduling them, and then training a model that regresses from features of the pipeline and schedule to runtime. Applied to real code, we can generate fast code in seconds by finding the schedule that minimizes predicted runtime using a variant of beam search. This mode does not require benchmarking on the target device, yet performs on par with expert humans. We can generate faster code in hours by adding a layer of autotuning on top of this. This mode produces schedules that are on average 40% faster than the best expert humans on x86 CPUs. The scheduling transformations exploited include deeply nested fusion-in-tiles, sliding window strategies, inlining, unrolling, register blocking, and storage layout reordering.
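The search strategy described above — beam search guided by a learned cost model — can be sketched abstractly. The schedule choices and `predicted_cost` function below are made-up stand-ins for Halide's scheduling directives and the learned runtime regressor; this is the shape of the search, not the real autoscheduler.

```python
CHOICES = ["tile", "fuse", "unroll", "vectorize"]

def predicted_cost(schedule):
    # Toy stand-in for a learned runtime model: pretend vectorizing
    # and tiling help, and unrolling more than once hurts.
    cost = 10.0
    cost -= 3.0 * ("vectorize" in schedule)
    cost -= 2.0 * ("tile" in schedule)
    cost += 4.0 * (schedule.count("unroll") > 1)
    return cost

def beam_search(depth=3, width=2):
    beam = [[]]  # start from the empty schedule
    for _ in range(depth):
        expansions = [s + [c] for s in beam for c in CHOICES]
        # Keep only the `width` schedules with the lowest predicted cost.
        beam = sorted(expansions, key=predicted_cost)[:width]
    return beam[0]

best = beam_search()
print(best, predicted_cost(best))
```

Because the cost is predicted rather than measured, the whole search runs without touching the target device — which is exactly what makes the seconds-fast mode possible; the autotuning layer then replaces predictions with benchmarks for the top candidates.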
Operating systems allow users to customize system policies by executing user programs in-kernel. Bugs in the in-kernel compiler could lead to serious security vulnerabilities. This talk will show how to synthesize a provably correct JIT compiler for such policy programs.
Web applications must run correctly across a diverse landscape of devices and user configurations. VizAssert verifies accessibility of a web app by implementing a web browser in the constraint solver, effectively testing the app on all possible configurations at once. VizAssert has identified dozens of accessibility bugs and provided feedback that helped developers fix them. This talk will explain how VizAssert works and describe how to apply it to your own websites.
Recent research has uncovered a broad class of security vulnerabilities in which confidential data is leaked through programmer-observable microarchitectural state. In this talk, I will present CheckMate, a rigorous approach and automated tool for determining if a microarchitecture is susceptible to specified classes of security exploits, and for synthesizing proof-of-concept exploit code when it is. Our approach adopts “microarchitecturally happens-before” (µhb) graphs, which prior work designed to capture the subtle orderings and interleavings of hardware execution events when programs run on a microarchitecture. CheckMate extends µhb graphs to facilitate modeling of security exploit scenarios and hardware execution patterns indicative of classes of exploits. Furthermore, it leverages relational model finding techniques to enable automated exploit program synthesis from microarchitecture and exploit pattern specifications. In addition to presenting CheckMate, I will describe a case study where we use CheckMate to evaluate the susceptibility of a speculative out-of-order processor to FLUSH+RELOAD cache side-channel attacks. The automatically synthesized results are programs representative of Meltdown and Spectre attacks. We also evaluate the same processor on its susceptibility to a different timing side-channel attack: PRIME+PROBE. Here, CheckMate synthesized new exploits that are similar to Meltdown and Spectre in that they leverage speculative execution, but unique in that they exploit distinct microarchitectural behaviors—speculative cache line invalidations rather than speculative cache pollution—to form a side-channel. Most importantly, these results validate the CheckMate approach to formal hardware security verification and the ability of the CheckMate tool to detect real-world vulnerabilities.
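A core operation in µhb-style reasoning is checking whether a candidate execution's event-ordering graph is acyclic: a cycle means the ordering requirements contradict each other and the execution is unrealizable. The tiny graphs below are invented for illustration — real µhb graphs are derived from detailed microarchitecture models — but the acyclicity check itself is a standard depth-first search.

```python
def has_cycle(graph):
    """Detect a cycle in a directed graph given as {node: [successors]}."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on current path / done
    color = {n: WHITE for n in graph}

    def visit(n):
        color[n] = GRAY
        for m in graph.get(n, []):
            if color.get(m, WHITE) == GRAY:
                return True  # back edge: a cycle exists
            if color.get(m, WHITE) == WHITE and visit(m):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)

# Invented ordering edges among per-instruction execution events.
# The first graph's requirements are contradictory (a cycle), so the
# execution it describes cannot occur on the modeled hardware.
cyclic = {"A_exec": ["B_exec"], "B_exec": ["B_commit"], "B_commit": ["A_exec"]}
acyclic = {"A_exec": ["B_exec"], "B_exec": ["B_commit"], "B_commit": []}
print(has_cycle(cyclic), has_cycle(acyclic))  # True False
```

CheckMate's relational model finding effectively searches for programs whose µhb graphs *satisfy* an exploit pattern while remaining realizable, which is far beyond this sketch, but the acyclic/cyclic distinction is the foundation.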
The key to attaining more general artificial intelligence is to develop architectures that are capable of learning complex algorithmic behaviors modeled as programs. The ability to learn programs can allow these architectures to compose high-level abstractions, which in turn leads to many benefits: (i) enabling neural architectures to perform more complex tasks, (ii) learning interpretable representations (programs that can be analyzed, debugged, or modified), and (iii) generalizing better to new inputs (as algorithms do). In this talk, I will present some of our recent work in developing neural architectures for learning programs from examples and natural language.
Modern computer systems are so heavily optimized that the resulting complexity overwhelms users and designers alike. For example, it is unclear how the various Linux file systems differ in terms of preserving data integrity after a system crash. This talk will feature synthesizers that discover corner-case behaviors and synthesize concise behavior descriptions that experts struggle to write.
The difficulty of building an optimizing compiler has become more apparent as a plethora of new accelerators have been created to speed up workloads ranging from machine learning to packet processing. This is because today’s production compilers still rely on countless manually written heuristics and complex rewrite rules, both of which are hard to develop, to generate efficient code. This approach therefore struggles to keep pace with new hardware. I will briefly present three projects I have been working on during my PhD and at Google that help mitigate this problem. First, I will describe how we can avoid writing rewrite rules and heuristics for assembly-level code generation using GreenThumb, a superoptimizer construction framework. Second, I will demonstrate how we can leverage collaboration between a synthesizer and human programmers to create efficient GPU kernels that exploit advanced data movements. Last, I will talk about my current work on autotuning the XLA TPU compiler instead of relying on heuristics, and an effort toward building a self-evolving compiler.
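A superoptimizer of the kind GreenThumb helps construct replaces hand-written rewrite rules with search: enumerate short instruction sequences and keep one that matches a reference program on test inputs (real tools then verify the match symbolically rather than by testing). The three-instruction "ISA" below is invented for illustration.

```python
import itertools

# Each "instruction" transforms a single register value.
ISA = {
    "add_r_r": lambda r: r + r,   # r := r + r
    "shl1":    lambda r: r << 1,  # r := r << 1
    "inc":     lambda r: r + 1,   # r := r + 1
}

def reference(r):
    # Reference code: multiply by 4 the slow way.
    return r + r + r + r

def superoptimize(tests, max_len=2):
    """Find the shortest instruction sequence matching reference on tests."""
    for length in range(1, max_len + 1):
        for seq in itertools.product(ISA, repeat=length):
            def run(r, seq=seq):
                for ins in seq:
                    r = ISA[ins](r)
                return r
            if all(run(t) == reference(t) for t in tests):
                return seq
    return None

print(superoptimize([0, 1, 7, 100]))  # -> ('add_r_r', 'add_r_r')
```

Because the search is exhaustive over a real ISA's semantics, no heuristic has to anticipate which rewrites are profitable — which is why the approach adapts to new hardware once its instruction semantics are modeled.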
This talk will describe how to efficiently find bugs in smart contracts, and how to do so by constructing the analyzer largely automatically, from just the specification of the smart contract programming language.
Program synthesis traditionally entails generating programs from formal and verifiable specifications such as input-output examples, traces, or assertions. However, many real-life applications admit only fuzzier specification mechanisms such as natural language descriptions or even just the surrounding context. In the machine learning community, program generation from natural language is known as semantic parsing, and while advanced, it often struggles with generating correct and executable programs. In this talk, I'll describe some of our recent forays into combining semantic parsing with formal methods. We'll see how partial program execution and relational database analysis help NL-to-SQL generation, and how automatic mining of idiomatic operators from a code corpus helps synthesize high-level programs.
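One simple way execution helps NL-to-SQL generation: run each candidate query a semantic parser proposes against the database and prune those that fail to execute at all. The table, question, and candidate list below are invented for illustration.

```python
import sqlite3

# A tiny in-memory database standing in for the user's real schema.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, age INTEGER)")
db.executemany("INSERT INTO users VALUES (?, ?)",
               [("ada", 36), ("alan", 41), ("grace", 85)])

# Question: "How many users are older than 40?"
# Hypothetical parser output, ranked best-first.
candidates = [
    "SELECT COUNT(*) FROM users WHERE age > 40",    # correct
    "SELECT COUNT(*) FROM user WHERE age > 40",     # wrong table name
    "SELECT COUNT(name FROM users WHERE age > 40",  # malformed
]

def executable(sql):
    """Keep a candidate only if the database accepts and runs it."""
    try:
        db.execute(sql).fetchall()
        return True
    except sqlite3.Error:
        return False

survivors = [q for q in candidates if executable(q)]
print(survivors)  # only the first candidate survives
```

Execution prunes whole classes of parser mistakes (unknown tables, syntax errors) without any labeled data; the relational-analysis techniques in the talk go further, reasoning about queries that run but return the wrong thing.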
Data analytics requires sophisticated SQL queries, forcing programmers to seek help from SQL experts. This talk will demo a tool that automatically synthesizes SQL queries from the questions posed on Stack Overflow.
Hardware evolves at frantic pace, making past investments in software obsolete. This talk will demo a tool that “rejuvenates” Adobe Photoshop by synthesizing the algorithms hidden in the source code. The algorithms are then automatically compiled to new hardware, avoiding manual reimplementation.
⬆ Back to Top ⬆The conference will be held in the Bay Area on September 19 and 20th. To attend, please fill the form at the link below.