Month: September 2018

How 20 big-name US VC firms invest at Series A & B – Pitchbook

NEA is one of the most well-known investors around, and the firm also takes the crown as the most active VC investor in Series A and B rounds in the US so far in 2018. Andreessen Horowitz, Accel and plenty of the other usual early-stage suspects are on the list, too.

Also included is a pair of names that have been in the news this year for backing away from the traditional VC model: Social Capital and SV Angel. The two are on the list thanks to deals completed earlier in the year.

Just how much are these prolific investors betting on Series A and Series B rounds? And at what valuation? We’ve used data from the PitchBook Platform to highlight a collection of the top venture capital investors in the US (excluding accelerators) and provide information about the Series A and B rounds they’ve joined so far this year. Click on the graphic below to open a PDF.


Corporate Venture Investment Climbs Higher Throughout 2018 – Crunchbase

Many corporations are pinning their futures on their venture investment portfolios. If you can’t beat startups at the innovation game, go into business with them as financial partners.

Though many technology companies have robust venture investment initiatives—Alphabet’s venture funding universe and Intel Capital’s prolific approach to startup investment come to mind—other corporations are just now doubling down on venture investments.

In recent months, several big corporations have committed additional capital to corporate investments. For example, defense firm Lockheed Martin added an additional $200 million to its in-house venture group back in June. Duck-represented insurance firm Aflac just bumped its corporate venture fund from $100 million to $250 million, and Cigna just launched a $250 million fund of its own. This is to say nothing of financial vehicles like SoftBank’s truly enormous Vision Fund, into which the Japanese telecoms giant invested $28 billion of its own capital.

And 2018 is on track to set a record for U.S. corporate involvement in venture deals. We come to this conclusion after analyzing the corporate venture investment patterns of the top 100 publicly traded U.S.-based companies (as ranked by market capitalization at time of writing). The chart below shows investing activity, broken out by stage, for each year since 2007.

A few things stick out in this chart.

The number of rounds these big corporations invest in is on track to set a new record in 2018. Keep in mind that there’s a little over one full quarter left in the year. And although the holidays tend to bring a modest slowdown in venture activity over time, there’s probably sufficient momentum to break prior records.

The other thing to note is that our subset of corporate investors has, over time, made more investments in seed and early-stage companies. In 2018 to date, seed and early-stage rounds account for over sixty percent of corporate venture deal flow, a share that may creep up as more rounds get reported. (There’s a documented reporting lag in angel, seed, and Series A deals in particular.) This is in line with the past couple of years.

Finally, we can view this chart as a kind of microcosm for blue-chip corporate risk attitudes over the past decade. It’s possible to see the fear and uncertainty of the 2008 financial crisis causing a pullback in risk capital investment.

Even though the crisis started in 2008, the stock market didn’t bottom out until 2009. You can see that bottom reflected in the low point of corporate venture investment activity. The economic recovery that followed, bolstered by cheap interest rates, ultimately yielded today’s slightly bloated and strung-out market for both public and private investors. We’re in the thick of it now.

Whereas most traditional venture firms are beholden to their limited partners, that investor base is often spread rather thinly between different pension funds, endowments, funds-of-funds, and high-net worth family offices. With rare exception, corporate venture firms have just one investor: the corporation itself.

More often than not, that results in corporate venture investments being directionally aligned with corporate strategy. But corporations also invest in startups for the same reason garden-variety venture capitalists and angels do: to own a piece of the future.

A Note On Data

Our goal here was to develop as full a picture as possible of a corporation’s investing activity, which isn’t as straightforward as it sounds.

We started with a somewhat constrained dataset: the top 100 U.S.-based publicly traded companies, ranked by market capitalization at time of writing. We then traversed each corporation’s network of sub-organizations as represented in Crunchbase data. This allowed us to collect not just the direct investments made by a given corporation, but also investments made by its in-house venture funds and other subsidiaries.

It’s a similar method to what we did when investigating Alphabet’s investing universe. Using Alphabet as an example, we were able to capture its direct investments, plus the investments associated with its sub-organizations, and their sub-organizations in turn. Except instead of doing that for just one company, we did it for a list of 100.
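The traversal described above amounts to a breadth-first walk over each company's sub-organization tree. The sketch below illustrates the idea; the `sub_orgs` and `investments` dictionaries are made-up stand-ins for Crunchbase data, not its actual API:

```python
from collections import deque

# Hypothetical data: parent -> sub-organizations, org -> investments.
sub_orgs = {
    "Alphabet": ["GV", "CapitalG"],
    "GV": [],
    "CapitalG": [],
}
investments = {
    "Alphabet": ["DeepMind"],
    "GV": ["Uber", "Slack"],
    "CapitalG": ["Stripe"],
}

def collect_investments(root):
    """Breadth-first walk of an org's sub-organization tree,
    gathering direct and indirect investments."""
    seen, found = set(), []
    queue = deque([root])
    while queue:
        org = queue.popleft()
        if org in seen:
            continue
        seen.add(org)
        found.extend(investments.get(org, []))
        queue.extend(sub_orgs.get(org, []))
    return found

print(collect_investments("Alphabet"))
# ['DeepMind', 'Uber', 'Slack', 'Stripe']
```

Running this once per company in the top-100 list would reproduce the shape of the analysis, with the caveats about missing sub-organization links noted below.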

This is by no means a perfect approach. It’s possible that corporations have venture arms listed in Crunchbase, but for one reason or another the venture arm isn’t listed as a sub-organization of its corporate parent. Additionally, since most of the corporations on this list have a global presence despite being based in the United States, it’s likely that some of them make investments in foreign markets that don’t get reported.


Visualizing Better Transportation: Data & Tools – Steve Pepple

There is a wide array of data, products, resources, and tools available. In the spirit of “emergence” and getting data out of silos, this blog post lists a bunch of them. These tools, techniques, and resources also make it possible to combine data in insightful ways.


When you start working with data around transportation and geospatial analysis, you’ll enter a world full of technical terms and acronyms. It can be daunting at first, but you can learn step by step and there are countless resources to help you along the way.

Before you jump into data, here are a few essential resources and tools to take you from the basics (no coding required) to pro techniques:

Transit Tools

There are a number of data tools you can use to analyze and visualize transportation and geospatial data without needing to code.

Mobility Explorer from TransitLand.

Transportation & Mobility Data

Now that we’ve looked at some essential tools for mapping and analyzing data, let’s look at interesting data to visualize.

The following organizations are doing exciting work in transportation and mobility. They will showcase data and tools at our event on Sept. 26th:

  • ARUP— An independent firm of designers, planners, engineers, consultants and technical specialists working across every aspect of today’s built environment.
  • SFCTA & SFMTA Emerging Mobility Committee — A joint committee between agencies that has created principles for mobility services and a number of useful tools for exploring transit and mobility in San Francisco.
  • Remix – Envision ideas, collaborate, and implement plans with a platform for the modern, multimodal city. They have a new tool for designing and visualizing scooter and bicycle mobility.
  • Strava Metro —Plan and build better active transportation infrastructure by partnering with a global community of people on the move.
  • Swiftly — Data analytics for transit agencies for improving service quality, efficiency, and reliability.
Data Visualization from CTA Emerging Mobility, Strava, and Swiftly.

And here are a number of other datasets from other companies and organizations:

  • 311 Dashboard — Explore 311 complaints and service requests in San Francisco.
  • Portal — Developer portal and open data for 511 Bay Area including data for AC Transit, BART, Caltrain, SFMTA, SamTrans, and other transit operators.
311 Data Explorer & 511 Trip Planner and Developer Resources
Visualization by JUMP Bikes; Ford GoBike trips visualized by Steve Pepple using Carto.
  • NextBus — Provides real-time location data for a number of transportation agencies. Here is documentation on their developer API.
  • — It is a data standard and a platform that serves as a launching pad for public-private collaboration and a clearinghouse for data exchange.
  • San Francisco Municipal Transportation Agency (SFMTA) — Provides an interactive project map. The agency also has an open data initiative in the works to aggregate data from emerging mobility services providers.
  • TNCs Today — Provides a data snapshot of Transportation Network Companies (TNCs), such as Lyft and Uber, operating in San Francisco.
  • Transitland — An aggregation of transit networks maintained by transit enthusiasts and developers.
  • Vital Signs Data Center — Explore a wide variety of public datasets related to transportation, land use, the economy, the environment, and social equity.
Example of Resident Travel by Transportation from MTC Vital Signs; TNCs today from SFCTA

Tools & Code

Once you have the data you want to explore and analyze, try these useful tools and libraries for analyzing and visualizing transportation and spatial data.

  • D3.js — Check out all the examples on Mike Bostock’s website. For example, here is how to create a real-time transit map of San Francisco.
  • — Open source data visualization tools from Uber. Especially good for visualization of large datasets in WebGL maps.
  • Esri Transit Tools — Tools for ESRI and ArcGIS users working with transit and network data.
  • — Open-source geocoder (based on Mapzen’s Pelias) that allows users to look up geographic coordinates of addresses and vice versa. Mapbox, CARTO, and Esri also have search APIs for geocoding addresses.
  • Leaflet.js — the best frontend library for working with the display of points, symbols, and all types of features on the web and mobile devices. The library supports rectangles, circles, polygons, points, custom markers, and a wide variety of layers. It performs quickly, handles a variety of formats, and makes styling of map features easy.
  • Open Source Routing Machine — OSRM is a project for routing paths between origins and destinations in road networks. Mapbox also has a turn-by-turn Directions API, and Nokia Here has a service that supports transit.
  • Open Source Planning Tools — An extension of GTFS for transportation planning and network analysis.
  • Replica — A city planning tool from Sidewalk labs for exploring and analyzing where people move. Here’s Nick Bowden’s post about how the tool used de-identified or anonymous mobility and foot traffic data to model how people travel in urban areas.
  • Turf.js — Mapbox library for geospatial analysis in the browser. Turf lets you create collections of geographic features, then quickly analyze, process, and simplify the data spatially before visualizing it.
  • UrbanSim — An open source simulation platform for supporting planning and analysis of urban development, incorporating the interactions between land use, transportation, the economy, and the environment. You can check out a simulation of the Bay Area on MTC portal.
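As a taste of the kind of spatial analysis libraries like Turf.js perform, here is a minimal point-in-polygon test in plain Python using the classic ray-casting method. The district polygon and stop coordinates are invented for illustration; production code should use a hardened library rather than this sketch:

```python
def point_in_polygon(pt, polygon):
    """Ray-casting test: cast a horizontal ray from pt and count
    how many polygon edges it crosses (odd count = inside)."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):                     # edge spans the ray's y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:                          # crossing is to the right
                inside = not inside
    return inside

# A rough (lon, lat) bounding polygon and two hypothetical transit stops.
district = [(-122.42, 37.77), (-122.40, 37.77), (-122.40, 37.79), (-122.42, 37.79)]
stops = [(-122.41, 37.78), (-122.45, 37.78)]
print([point_in_polygon(s, district) for s in stops])  # [True, False]
```

Filtering stops, trips, or bike-share rides to a neighborhood polygon like this is a common first step before the visualization tools above take over.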


Is Your AI SoC Secure? – Where security is needed in AI environments

As artificial intelligence (AI) enters every application, from IoT to automotive, it is bringing new waves of innovation and business models, along with the need for high-grade security. Hackers try to exploit vulnerabilities at all levels of the system, from the system-on-chip (SoC) up. Therefore, security needs to be integral in the AI process. The protection of AI systems, their data, and their communications is critical for users’ safety and privacy, as well as for protecting businesses’ investments. This article describes where security is needed throughout AI environments as well as implementation options for ensuring a robust, secure system.

Where and why AI security is needed
AI applications built around artificial neural networks (ANNs, or simply “neural nets”) include two basic stages – training and inference (Figure 1). The training stage of a neural network is when the network is “learning” to do a job, such as recognizing faces or street signs. The resulting data set for the configuration (weights representing the interaction between neurons) of the neural net is called a model. In the inference stage, the algorithm embodied in the model is deployed to the end application.

Figure 1: The training and inferencing stages of deep learning and AI.

The algorithms used in neural net training often include data that require privacy, such as exactly how faces and fingerprints are collected and analyzed. The algorithm is a large part of the value of any AI technology. In many cases, the large training data sets that come from public surveillance, face recognition and fingerprint biometrics, financial, and medical applications are private and often contain personally identifiable information. Attackers, whether organized crime groups or business competitors, can take advantage of this information for economic or other rewards. In addition, AI systems face the risk of rogue data injected maliciously to disrupt a neural network’s functionality (e.g., misclassification of face recognition images to allow attackers to escape detection). Companies that protect training algorithms and user data will be differentiated in their fields from companies that suffer the negative PR and financial risks of being exploited. Hence, it is highly important to ensure that data is received only from trusted sources and protected during use.

The models themselves, represented by the neural net weights, and the training process are incredibly expensive and valuable intellectual property to protect. Companies that invest to create and build these models want to protect them against disclosure and misuse. The confidentiality of program code associated with the neural network processing functions is considered less critical. However, access to it can aid someone attempting to reverse engineer the product. More importantly, the ability to tamper with this code can result in the disclosure of all assets that are plaintext inside the security boundary.

In addition to protecting data for business reasons, another strong driver for enforcing personal data privacy is the General Data Protection Regulation (GDPR) that came into effect within the European Union on May 25, 2018. This legal framework sets guidelines for the collection and processing of personal information. The GDPR sets out the principles for data management protection and the rights of the individual, and large fines may be imposed on businesses that do not comply with the rules.

As data and models move between the network edge and the cloud, communications need to be secure and authentic. It is important to ensure that the data and/or models are protected and are only being communicated and downloaded from authorized sources to devices that are authorized to receive it.

Security solutions for AI
Security needs to be incorporated starting from product concept throughout the entire lifecycle. As new AI applications and use cases emerge, devices that run these applications need to be capable of adapting to an evolving threat landscape. To address the high-grade protection requirements, security needs to be multi-faceted and “baked-in” from the edge devices incorporating neural network processing system-on-chips (SoCs) right through to the applications that run on them and carry their data to the cloud.

At the outset, system designers adding security to their AI product must consider a few foundational security enablers, functions that belong in the vast majority of products (AI included) to protect every phase of operation: offline, during power-up, and at runtime, including during communication with other devices or the cloud. Establishing the integrity of the system is essential to creating trust that the system is behaving as intended.

Secure bootstrap, an example of a foundational security function, establishes that the software or firmware of the product is intact (“has integrity”). Integrity assures that when the product is coming out of reset, it is doing what its manufacturer intended – and not something a hacker has altered. Secure bootstrap systems use cryptographic signatures on the firmware to determine their authenticity. While predominantly firmware, secure bootstrap systems can take advantage of hardware features such as cryptographic accelerators and even hardware-based secure bootstrap engines to achieve higher security and faster boot times. Flexibility for secure boot schemes is maximized by using public key signing algorithms with a chain of trust traceable to the firmware provider. Public key signing algorithms can allow the code signing authority to be replaced by revoking and reissuing the signing keys if the keys are ever compromised. The essential feature that security hinges on is that the root public key is protected by the secure bootstrap system and cannot be altered. Protecting the public key in hardware ensures that the root of trust identity can be established and is unforgeable.
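A much-simplified sketch of the verification step follows. A real secure-boot engine checks an asymmetric signature (e.g. ECDSA) against a root public key fused into hardware; here an HMAC over the image stands in for that signature so the example stays self-contained, and the key and firmware bytes are invented:

```python
import hashlib
import hmac

# Hypothetical immutable key; in hardware this would be a fused public key.
ROOT_KEY = b"root-of-trust-key-burned-into-otp"

def sign_firmware(image: bytes) -> bytes:
    """Stand-in for the vendor's code-signing step."""
    return hmac.new(ROOT_KEY, image, hashlib.sha256).digest()

def secure_boot(image: bytes, signature: bytes) -> bool:
    """Refuse to boot unless the image matches its signature."""
    expected = hmac.new(ROOT_KEY, image, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)   # constant-time compare

firmware = b"\x7fELF...vendor firmware v1.2"
sig = sign_firmware(firmware)
print(secure_boot(firmware, sig))                 # True: boot proceeds
print(secure_boot(firmware + b"backdoor", sig))   # False: boot halts
```

The point the article makes about revocation is visible even in this sketch: everything hinges on the verification key being unalterable, which is why it lives in protected hardware.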

The best encryption algorithms can be compromised if the keys are not protected with key management, another foundational security function. For high-grade protection, the secret key material should reside inside a hardware root of trust. Permissions and policies in the hardware root of trust enforce that application layer clients can manage the keys only indirectly through well-defined application programming interfaces (APIs). Continued protection of the secret keys relies on authenticated importing of keys and wrapping any exported keys. An example of a common key management API for embedded hardware secure modules (HSM) is the PKCS#11 interface, which provides functions to manage policies, permissions, and use of keys.
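The handle-and-policy pattern can be sketched as follows. This toy key store is loosely modeled on a PKCS#11-style API but is not a real HSM interface; all names are illustrative:

```python
import hashlib
import hmac
import os

class KeyStore:
    """Toy hardware-root-of-trust key store: clients receive opaque
    handles and use keys only through the API; raw key bytes never
    cross the security boundary."""
    def __init__(self):
        self._keys = {}   # handle -> (key_bytes, policy)
        self._next = 1

    def generate(self, policy):
        handle = self._next
        self._next += 1
        self._keys[handle] = (os.urandom(32), set(policy))
        return handle     # an opaque handle, not the key material

    def mac(self, handle, data):
        key, policy = self._keys[handle]
        if "sign" not in policy:
            raise PermissionError("key policy forbids signing")
        return hmac.new(key, data, hashlib.sha256).hexdigest()

ks = KeyStore()
signing_key = ks.generate(policy={"sign"})
tag = ks.mac(signing_key, b"model-update-v2")   # 64 hex chars
wrap_only = ks.generate(policy={"wrap"})
# ks.mac(wrap_only, b"...") raises PermissionError: the policy is
# enforced inside the trust boundary, not by the application.
```

The essential property mirrored here is that application code can ask for operations on a key but can never read the key itself.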

A third foundational function relates to secure updates. Whether in the cloud or at the edge, AI applications will continue to get more sophisticated and data and models will need to be updated continuously, in real time. The process of distributing new models securely needs to be protected with end-to-end security. Hence it is essential that products can be updated in a trusted way to fix bugs, close vulnerabilities, and evolve product functionality. A flexible secure update function can even be used to allow post-consumer enablement of optional features of hardware or firmware.

After addressing foundational security issues, designers must consider how to protect the data and coefficients in their AI systems. Many neural network applications operate on audio, still images or video streams, and other real-time data. These large data sets often carry significant privacy concerns, so protecting large data in memory, such as DRAM, or stored locally on disk or flash, is essential. High-bandwidth memory encryption (usually AES-based) backed by strong key management is required. Similarly, models can be protected through encryption and authentication, backed by strong key management systems enabled by a hardware root of trust.

To ensure that communications between edge devices and the cloud are secured and authentic, designers use protocols that incorporate mutual identification and authentication, for example client-authenticated Transport Layer Protocol (TLS). The TLS session handshake performs identification and authentication, and if successful the result is a mutually agreed shared session key to allow secure, authenticated data communication between systems. A hardware root of trust can ensure the security of credentials used to complete identification and authentication as well as the confidentiality and authenticity of the data itself. Communication with the cloud will require high bandwidth in many instances. As AI processing moves to the edge, high-performance security requirements are expected to propagate there as well, including the need for additional authentication, to prevent the inputs to the neural network from being tampered with and to ensure that AI training models have not been tampered with.
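Using Python's `ssl` module as a stand-in for an embedded TLS stack, client authentication amounts to requiring a certificate from the peer. The certificate and CA paths below are hypothetical and left commented out so the sketch stays self-contained; a hardware root of trust would hold the private key instead of a file on disk:

```python
import ssl

def server_context():
    """Server-side TLS context that demands a client certificate,
    the server half of mutual (client-authenticated) TLS."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # ctx.load_cert_chain("server.crt", "server.key")   # server identity
    # ctx.load_verify_locations("device-ca.pem")        # trusted device CAs
    ctx.verify_mode = ssl.CERT_REQUIRED   # reject peers without a valid cert
    return ctx

ctx = server_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)   # True
```

With `CERT_REQUIRED` set, the handshake fails unless the edge device presents a certificate chaining to a trusted CA, giving the mutual identification the article describes.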

Neural network processor SoC example
Building an AI system requires high performance with low-power, area-efficient processors, interfaces, and security. Figure 2 shows a high level architecture view of a secure neural network processor SoC used in AI applications. Neural network processor SoCs can be made more secure when implemented with proven IP.

Figure 2: A Trusted execution environment with DesignWare IP helps secure neural network SoCs for AI applications.

Embedded vision processor with CNN engine
Synopsys EV6x Embedded Vision Processors combine scalar, vector DSP and convolutional neural network (CNN) processing units for accurate and fast vision processing. They are fully programmable and configurable, combining the flexibility of software solutions with the high performance and low power consumption of dedicated hardware. The CNN engine supports common neural network configurations, including popular networks such as AlexNet, VGG16, GoogLeNet, YOLO, SqueezeNet, and ResNet.

Hardware Secure Module with Root of Trust
Synopsys’ tRoot hardware secure module with root of trust is designed for easy integration into SoCs. It provides a scalable platform offering diverse security functions in a trusted execution environment (TEE) as a companion to one or more host processors, including secure identification and authentication, secure boot, secure updates, secure debug, and key management. tRoot protects AI devices using unique code protection mechanisms that provide run-time tamper detection and response, and code privacy protection, without the added cost of dedicated secure memory. This reduces system complexity and cost by allowing tRoot’s firmware to reside in any non-secure memory space; commonly, tRoot programs reside in shared system DDR memory. Thanks to the confidentiality and integrity provisions of its secure instruction controller, this memory is effectively private to tRoot and impervious to modification attempts originating in other subsystems on the chip or from the outside.

Security protocol accelerator
Synopsys DesignWare Security Protocol Accelerators (SPAccs) are highly integrated embedded security solutions with efficient encryption and authentication capabilities to provide increased performance, ease-of-use, and advanced security features such as quality-of-service, virtualization, and secure command processing. The SPAccs offer designers unprecedented configurability to address the complex security requirements that are commonplace in today’s multi-function, high-performance SoC designs by supporting major security protocols such as IPsec, TLS/DTLS, WiFi, MACsec, and LTE/LTE-Advanced.

AI is poised to revolutionize the world. There are incredible opportunities that AI brings, some being realized right now, others yet to come. Providers of AI solutions are investing significant R&D capital, and the models derived from the training data (unique coefficients or weights) represent a big investment that needs to be protected. With new regulations like GDPR in place, serious concerns about the privacy and confidentiality of people’s data, and huge investments in intellectual property in neural network architecture and model generation, companies providing AI solutions should be leading the charge to put secure processes in place around their AI products and services.



Steps to Successful Machine Shop Digitalization – Modern Machine Shop

Digitalization of machine shops

Industry 4.0. Industrial Internet of Things (IIoT). Smart manufacturing. The buzzwords abound, but what does this technology really mean for your shop?

“Whether your business is small or large, you should know that digital manufacturing is coming, and it’s coming really quickly,” says Sean Holt, president of Sandvik Coromant for the Americas. “You have to assess how it’s going to affect the sustainability of your business, what are its risks, its benefits, and most importantly, how to take the first steps towards digitalization.”

“Who cares?” you might be thinking. “I just want to make good parts, on time, and for a fair price. That’s what’s most important to me.”

Consider this: any seasoned machinist or programmer can walk up to a machine and know instantly if something is awry. There’s just one problem: finding those qualified people is increasingly arduous, and most shops need their operators to manage multiple machines. It would be a huge advantage to have another way to know that the parts being made on Machine #5 right now are about to go out of tolerance, and further, that the spindle bearings on VMC #2 will fail in three weeks.

The path to those capabilities is data; cutting tool data, machine data, quality data, operator productivity data. It may sound simplistic, but that’s the essence of Industry 4.0 and the Internet of Things: the collection and analysis of data, followed by better decision-making as a result of these data-related efforts. To the shop of the future, data will be everything.
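To make the "parts about to go out of tolerance" idea concrete, here is a minimal drift check over recent measurements. The machine, dimensions, and thresholds are invented for illustration; real monitoring systems use far richer statistical process control:

```python
from statistics import mean

def drifting(measurements, nominal, tolerance, window=5, alert_at=0.8):
    """Flag a process whose recent average has drifted toward a
    tolerance limit, alerting at `alert_at` of the band so an
    operator can intervene before parts go out of spec."""
    recent = measurements[-window:]
    return abs(mean(recent) - nominal) > alert_at * tolerance

# Bore diameters (mm) trending upward on a hypothetical Machine #5.
diams = [10.000, 10.002, 10.005, 10.009, 10.012, 10.014, 10.016]
print(drifting(diams, nominal=10.000, tolerance=0.012))  # True: intervene now
```

Every part here is still within the ±0.012 mm band, yet the check already fires, which is exactly the early warning the paragraph above describes.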

That’s why many equipment builders, and now tooling suppliers, are making their products “smart,” giving manufacturers the ability to “listen” to what the shop floor is telling them with data that is easy to understand.

In a controlled machining process where decisions and actions are based on real-time information and facts, manufacturers are able to optimize their processes and significantly reduce manufacturing-related waste. The CoroPlus® ProcessControl monitoring system—including hardware, software, installation, and support—increases overall in-cut process stability and security, ultimately enabling increased productivity and profitability.

Shops that embrace digitalization will have much better information with which to operate their businesses because:

  • Their machines and cutting tools will be able to identify chatter (even when it can’t be heard), so part quality and tool life will improve.
  • Understanding what’s going on inside spindle bearings, axis motors and other machine components will prevent unexpected equipment failures.
  • Knowing exactly how hard to push cutting tools will increase production levels.
  • The ability to more easily spot trends in part quality and tool wear will help machinists and engineers develop better processes.

Start Small

vibration sensing boring tools

Getting started is easier and less expensive than you probably expect—you won’t need a big budget ($1,000 or so should do) or the technical skills of a computer scientist. “I tell people to start small,” says Andy Henderson, vice president of engineering at industrial technology company Praemo. “Hook up one machine, start collecting some data, and then let the value you’re receiving from that machine pay for the next one, and the next, scaling upwards as you go.”

At the very least, getting your machine tools “connected” will let you check production status from anywhere. Taken to the next level, you can gather hundreds or even thousands of data points from a modern machine tool, including in-process metrology data, machine maintenance information, production output, scrap levels, cutting tool usage, job status…the list goes on and on.

But don’t do that. At least, not yet. Better to pick a pilot machine, choose one or two of whichever production values or machine metrics are most important to you, and start watching the data flow in. You’ll soon spot causes of downtime that are expensive to the shop but simple to cure. Areas for continuous improvement will become abundantly clear. Unexpected failures will eventually become a thing of the past.
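Spotting causes of downtime on a pilot machine can be as simple as summing stop intervals by reason. The event format below is invented for illustration, not any particular machine-monitoring schema:

```python
from collections import defaultdict

# Status events: (minutes since shift start, state, stop reason).
events = [
    (0,   "RUNNING", None),
    (95,  "STOPPED", "waiting for material"),
    (120, "RUNNING", None),
    (300, "STOPPED", "tool change"),
    (310, "RUNNING", None),
    (480, "END",     None),
]

def downtime_by_reason(events):
    """Sum minutes spent stopped, grouped by the recorded reason."""
    totals = defaultdict(int)
    for (t0, state, reason), (t1, _, _) in zip(events, events[1:]):
        if state == "STOPPED":
            totals[reason] += t1 - t0
    return dict(totals)

print(downtime_by_reason(events))
# {'waiting for material': 25, 'tool change': 10}
```

Even this crude tally points at the cheap fix first: in the sample shift, material handling cost more than twice the time of tool changes.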


Hierarchy of benefits from digitalization of machine shops.

Additional benefits accrue as shops move up the scale of digitalization. Simply finding out why your equipment isn’t running is a good start. Ultimately, digitalization lets shops focus sharply on the value-adding activities that reap the highest returns on equipment and people.

Worried about the cost? Don’t be. According to Will Sobel, co-founder and chief strategy officer of advanced manufacturing analytics software company VIMANA, the ROI can be “amazingly ridiculous,” with payback sometimes as short as a few weeks. “If you look at a typical manufacturing process in a typical shop, equipment utilization is often around 30 percent,” he points out. “It doesn’t take much to improve that figure.”
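A back-of-envelope check of that payback claim, with assumed shop numbers; only the roughly 30 percent utilization figure and the $1,000-or-so pilot cost come from the article, the rest is invented:

```python
# Back-of-envelope ROI from a modest utilization bump.
machine_rate = 75.0        # $/hour a spindle earns while cutting (assumed)
hours_per_month = 400      # two shifts on one machine (assumed)
util_before, util_after = 0.30, 0.40
monitoring_cost = 1000.0   # one-time pilot cost, as cited above

extra_revenue = machine_rate * hours_per_month * (util_after - util_before)
payback_weeks = monitoring_cost / extra_revenue * 4   # ~4 weeks per month
print(f"extra revenue/month: ${extra_revenue:.0f}, payback: {payback_weeks:.1f} weeks")
```

Under these assumptions a ten-point utilization gain pays for the pilot in under two weeks, which is consistent with Sobel's "few weeks" figure.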

Move Rapidly

Stas Mylek, director of product management at CNC Software, developer of Mastercam, says those looking for a quick win should consider purchasing monitoring software. “There are plenty of applications out there that you can get for minimal investment and that make the traditional green, yellow, and red indicator lights obsolete,” he says. “Using such an application to collate and make sense of data allows you to better understand your processes and where each machine is making money.”

digital gateway of a machining operation

A connected shop provides actionable intelligence to all levels of the enterprise on how to manufacture more efficiently.

Having good monitoring software is one thing; acting on that information is another. And shops will be well served by appointing a data evangelist (or team, depending on the size of the company) to chase down improvement opportunities. This person will work with suppliers, report back to management, and work to spread the good word of digitalization throughout the organization.

“There goes the budget,” you may say. And while it’s true that taking your IIoT data collection pilot project to the next level will cost the company some cash—in infrastructure, hardware and software, and additional labor costs—it’s important to remember that the additional visibility to production and machine tool data, and the benefits derived from both, will greatly outweigh any investment costs.

That’s not to say that the appointed data evangelist should pound people over the head with his or her findings. For one thing, this person will typically have less manufacturing skill and experience than the machinists, programmers, and engineers responsible for part production each day. A talented but technically oriented machine operator is a good choice for the role: someone able to communicate effectively while recognizing that the people he or she works with may be reluctant to change their ways.

And there is still a vital role for your experienced people to play. Software doesn’t make great choices when comparing different solutions. That’s why humans will remain better judges (for the foreseeable future, at least) of when to shut a machine down, for example, or of the best way to adjust feeds and speeds when chatter occurs. This is why involvement from the entire manufacturing team is crucial to any Industry 4.0 implementation.

Ready to pull the trigger? All it takes is a connected machine tool, a little data, and a willingness to change. Properly implemented, the results will be greater throughput and higher profit margins. Get going. Industry 4.0 is waiting.




Law – Why Would Big Tech Companies Give Away Free Patents?

You may have heard that patent-rich tech companies like Canon, Red Hat and Lenovo are giving away free patents to startups that join a patent protection community. It’s part of a program through the nonprofit of which I’m a part, LOT Network, which is a group of companies ranging from startups to well-known names like Tesla, Facebook Inc., Lyft Inc. and Inc. that protect one another from litigation from patent assertion entities (PAEs, or “patent trolls”).

Though we live in a world where we’re trained to suspect anything given away for free, tech companies have valid, self-preserving reasons to give patents away, and those reasons benefit startups as well.

Supporting Innovation

Many of the companies donating patents were early pioneers in their industries, and continue to have a culture that values innovation.

While each has its own reason for donating patents—Red Hat, for instance, is participating in part to build bridges with software engineers who are leery of software patents, while Lenovo has decided, as a cost-saving measure, to shed patents in areas where its business is no longer involved—all share a desire to do something positive with them. At a minimum, these companies decided, in the interest of promoting innovation, to donate thousands of free patents to member startups.

By donating patents—and allowing startups to kick-start their patent portfolios for free—the big tech companies are providing startups with a huge benefit. These companies have spent decades investing in research and development, and developing IP capabilities, which allow them to create high quality patents with real value. These companies are pro-innovation and want to help support the technology leaders of tomorrow by giving them a quick leg up on the IP side.

Having a patent portfolio can increase the value of a startup, and make it more attractive to potential investors by showing that there is real substance to its IP strategy. A patent is also an asset that can be asserted or sold. As proof of the authenticity of this program, it is worth noting there are no strings attached. Startups are not required to give up equity or money to obtain the patents, they are not required to stay in LOT for any minimum period of time, and they can sell or abandon the patents at any time.

It can take 12 to 72 months for a patent application to be granted, and between $10,000 and $50,000 in legal costs and fees to obtain a patent. Thus, the donation program allows startups to save two of their most precious resources: time and money.

We recognize that startups are likely new to the patent “game.” That is why we wanted to help companies select which assets are best for them. Legal counsel and advisers are available to help each startup select the patents that best fit its business, and can serve as a resource to answer questions later as the startup builds its IP program.


The patent donation program is in part an incentive for startups to join a community devoted to protecting and promoting innovation. Today, the community is immunized against more than 1.1 million patent assets should they fall into the hands of a patent troll. The herd gets stronger with each new member, regardless of that member’s size.

More than half of the companies sued by trolls make less than $10 million in revenue. We also appreciate that cash is king with startups, and we did not want cost to be a barrier to protecting the next generation of innovators. That is why membership is free for any startup with less than $25 million in revenue, even if the startup doesn’t have any patents yet.

We see this as an investment in innovation that benefits everyone equally in the community.  Disruptive startups are the big tech companies of tomorrow. They will eventually have substantial patent portfolios—patents for which our community members want reciprocal immunity from patent troll lawsuits. Thus, the community is providing immunity from troll suits today to startups, in exchange for the startup protecting the membership from troll suits in the future. At the same time, LOT members are free to use their patents in all the traditional ways: sell them, license them and assert them against competitors in and outside the network.

This self-preserving strategy is good for startups too—without any cost to them. By joining our community, startups have a proactive, pre-emptive patent protection strategy that can save them from costly patent troll litigation—to the tune of over $1 million per lawsuit. In addition, they have the opportunity to network and learn from some of the most respected tech companies in the world.

Alexandra Sepulveda, vice president of legal at Udemy Inc., said it best: “The last place you want to be is in front of your board answering questions about why you didn’t put preventative measures in place.”

Source :




DeepSense – Four Suggestions for Using a Kaggle Competition to Test AI in Business

For companies seeking ways to test AI-driven solutions in a safe environment, running a competition for data scientists is a great and affordable way to go – when it’s done properly.

According to a McKinsey report, only 20% of companies consider themselves adopters of AI technology while 41% remain uncertain about the benefits that AI provides. Considering the cost of implementing AI and the organizational challenges that come with it, it’s no surprise that smart companies seek ways to test the solutions before implementing them and get a sneak peek into the AI world without making a leap of faith.

That’s why more and more organizations are turning to data science competition platforms like Kaggle, CrowdAI and DrivenData. Making a data science-related challenge public and inviting the community to tackle it comes with many benefits:

  • Low initial cost – the company needs only to provide data scientists with data, pay the entrance fee and fund the award. There are no further costs.
  • Validating results – participants provide the company with verifiable, working solutions.
  • Establishing contacts – A lot of companies and professionals take part in Kaggle competitions. The ones who tackled the challenge may be potential vendors for your company.
  • Brainstorming the solution – data science is a creative field, and there’s often more than one way to solve a problem. Sponsoring a competition means you’re sponsoring a brainstorming session with thousands of professional and passionate data scientists, including the best of the best.
  • No further investment or involvement – the company gets immediate feedback. If an AI solution proves efficacious, the company can move forward with it; otherwise, its involvement ends with funding the award, avoiding further costs.

While numerous organizations – big e-commerce websites and state administrations among them – sponsor competitions and leverage the power of the data science community, running a competition is not at all simple. An excellent example is the competition the US National Oceanic and Atmospheric Administration sponsored when it needed a solution that would recognize and differentiate individual right whales. Ultimately, what proved most efficacious was the principle of facial recognition, applied to the topsides of the whales, which were obscured by weather, water and the distance between the photographer above and the whales far below. To check whether this was even possible, and how accurate a solution might be, the organization ran a Kaggle competition – which our team won.

Having won several such competitions, we have encountered both brilliant and not-so-brilliant ones. That’s why we decided to prepare a guide for every organization interested in testing potential AI solutions in Kaggle, CrowdAI or DrivenData competitions.

Recommendation 1. Deliver participants high-quality data

The quality of your data is crucial to attaining a meaningful outcome. Minus the data, even the best machine learning model is useless. This also applies to data science competitions: without quality training data, the participants will not be able to build a working model. This is a great challenge when it comes to medical data, where obtaining enough information is problematic for both legal and practical reasons.

  • Scenario: A farming company wants to build a model to identify soil type from photos and probing results. Although there are six classes of farming soil, the company is able to deliver sample data for only four. Considering that, running the competition would make no sense – the machine learning model wouldn’t be able to recognize all the soil types.

Advice: Ensure your data is complete, clear and representative before launching the competition.
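A pre-flight check along these lines is easy to script before launch. Below is a minimal sketch in Python; the soil class names and label list are illustrative, not from any real competition:

```python
from collections import Counter

def check_class_coverage(labels, expected_classes, min_samples=1):
    """Return the expected classes that are missing or under-represented
    in the training labels."""
    counts = Counter(labels)
    return sorted(c for c in expected_classes if counts[c] < min_samples)

# Six soil classes are expected, but the sample data covers only four.
expected = ["clay", "silt", "sand", "loam", "peat", "chalk"]
labels = ["clay", "silt", "sand", "loam", "clay", "sand"]

missing = check_class_coverage(labels, expected)
print(missing)  # → ['chalk', 'peat']
```

Running a check like this against the full training set before publishing the data surfaces the missing-classes problem from the scenario above, while there is still time to fix it.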

Recommendation 2. Build clear and descriptive rules

Competitions are put together to achieve goals, so the model has to produce a useful outcome. And “useful” is the point here. Because competition participants are not professionals in the field they’re producing a solution for, the rules need to be based strictly on the use case and the model’s further use. Including even basic guidelines will help them address the challenge properly. Lacking these foundations, the outcome may be technically correct but totally useless.

  • Scenario: A city wants to map the distribution of children below the age of 7 to optimize social, educational and healthcare policies. For the mapping to be useful, it is crucial to include additional guidelines in the rules: the areas mapped need to be bounded by streets, rivers, rail lines, district borders and other topographical features of the city. Lacking these, many models may map the distribution by cutting the city into 10-meter-wide, kilometer-long strips – the segmentation is technically done, but the outcome is totally useless.

Advice: Think about usage and include the respective guidelines within the rules of the competition to make it highly goal-oriented and common sense driven.

Recommendation 3. Make sure your competition is crack-proof

Kaggle competition winners take home fame and the award, so participants are motivated to win. The competition organizer needs to remember that there are dozens (sometimes thousands) of brainiacs looking for “unorthodox” ways to win the competition. Here are three examples:

  • Scenario 1: A city launched a competition in February 2018 to predict traffic patterns based on historical data (2010-2016). The prediction covered the first half of 2017, with the real data from that period serving as the benchmark. Googling away, participants found that data, so it was easy to fabricate a model that “predicted” with 100% accuracy. That’s why the city decided to provide an additional, non-public dataset to enrich the data and validate whether the models were really doing predictive work.

However, competitions are often cracked in more sophisticated ways. Sometimes data may ‘leak’: data scientists get access to data they shouldn’t see and use it to prepare their model to tailor a solution to spot the outcome, rather than actually predicting it.

  • Scenario 2: Participants are challenged to predict users’ age from internet usage data. Before the competition, the large company running it noticed that every record carried a long alphanumeric ID with the user’s age embedded in it. Running the competition without deleting the ID would allow participants to crack it instead of building a predictive model.

Benchmark data is often shared with participants to let them polish their models. By comparing the input data and the benchmark it is sometimes possible to reverse-engineer the outcome. The practice is called leaderboard probing and can be a serious problem.

  • Scenario 3: The competition calls for a model to predict a person’s clothing size based on height and body mass. To get the benchmark, the participant has to submit 10 sample sizes. The benchmark then compares the outcome with the real size and returns an average error. By submitting properly selected numbers enough times, the participant cracks the benchmark. Anticipating the potential subterfuge, the company opts to provide a public test set and a separate dataset to run the final benchmark and test the model.
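The public/private split used as a defense in this scenario can be sketched in a few lines. The Python below is a hypothetical illustration, not any platform’s actual implementation: test-set row IDs are partitioned once, the public slice drives the live leaderboard, and the private slice is scored only for the final ranking.

```python
import random

def split_test_set(ids, public_fraction=0.3, seed=42):
    """Partition test-set row IDs into a public leaderboard slice and a
    private slice reserved for the final ranking."""
    rng = random.Random(seed)       # fixed seed: split is decided once
    shuffled = ids[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * public_fraction)
    return set(shuffled[:cut]), set(shuffled[cut:])

def score(predictions, truth, id_subset):
    """Mean absolute error over one slice of the test set."""
    errors = [abs(predictions[i] - truth[i]) for i in id_subset]
    return sum(errors) / len(errors)

ids = list(range(100))
public_ids, private_ids = split_test_set(ids)
# During the competition only score(..., public_ids) is shown to
# participants; score(..., private_ids) decides the final leaderboard.
```

Because a participant never sees feedback on the private slice, probing the public leaderboard cannot reveal the labels that decide the final standings.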

Advice: Look for every possible way your competition could be cracked and never underestimate your participants’ determination to win.

Recommendation 4. Spread the word about your competition

One of the benefits of running a competition is that you get access to thousands of data scientists, from beginners to superstars, who brainstorm various solutions to the challenge. Playing with data is fun and participating in competitions is a great way to validate and improve skills, show proficiency and look for customers. Spreading the word about your challenge is almost as important as designing the rules and preparing the data.

  • Scenario: A state administration is in need of a predictive model. It has come up with some attractive prizes and published the upcoming challenge for data scientists on its website. As these steps may not yield the results it’s looking for, it decides to sponsor a Kaggle competition to draw thousands of data scientists to the problem.

Advice: Choose a popular platform and spread the word about the competition by sending invitations and promoting the competition on social media. Data scientists swarm to Kaggle competitions by the thousands. It stands to reason that choosing a platform to maximize promotion is in your best interest.


Running a competition on Kaggle or a similar platform can not only help you determine if an AI-based solution could benefit your company, but also potentially provide the solution, proof of concept and the crew to implement it at the same time. Could efficiency be better exemplified?

Just remember, run a competition that makes sense. Although most data scientists engage in competitions just to win or to validate their skills, it is always better to invest time and energy in something meaningful. Whether a competition’s data and goal actually make sense is easier to spot than many sponsoring companies realize.

Preparing a model that is able to recognize plastic waste in a pile of trash is relatively easy. Building an automated machine to sort the waste is a whole different story. Although there is nothing wrong with probing the technology, it is much better to run a competition that will give feedback that can be used to optimize the company’s short- and long-term future performance. Far too many competitions either don’t make sense or produce results that are never used. Even if the competition itself proves successful, who really has the time or resources to do fruitless work?

Source :


Day in the Life of a Marketing Analytics Professional

Marketing Analytics is a multifaceted but often misunderstood practice. Here’s an example day to highlight the diversity of the role.

Marketing Analytics is often the foundation of any world-class Marketing program. But conferences, interviews and meetings have taught me that very few people understand the world of Marketing Analytics.

Some incorrectly describe Marketing Analytics solely as digital analytics — tracking visits, clicks and conversions. Yes, we do that, but that’s not all we do. I’ve heard others confuse Marketing Analytics with Market Research. I work closely with my Market Research colleagues, but I don’t typically do research. Once, I had someone angrily tell me I was responsible for their ads in Spotify. I’ve never worked at Spotify.

The reason I love my job is that my day can vary from light SQL coding through to full-blown machine-learning algorithms. My role is diverse and has impact; my analyses drive multi-million dollar decisions. I have the opportunity to meet with everyone from the CFO to energetic interns. And I look at data across the ecosystem. I review data on every product area and deep-dive into the relationship between product behavior, demographic information and cultural trends. I describe this work as applied computational social science.

To lift the veil and shed some light on the Marketing Analytics profession, I’ve pulled together a ‘day in the life of a Marketing Analytics professional’. Projects and tasks have been condensed to 45-minute intervals to ensure I’m able to provide a representative overview of projects that I regularly tackle.

Welcome to my day as a Marketing Analytics practitioner.

7.30am — Running Resource-heavy SQL Queries.

I like to get in early. Breakfast, coffee and SQL code. Databases run the fastest in the morning because there are fewer analysts running queries and sucking up computing resources. I kickstart a few large queries that need a lot of computing power. Run, you sweet database, you.

8.15am — Emails and Admin.

My inbox is taunting me. I support marketers all over the globe so I try to check my inbox early and answer questions, particularly from my EMEA colleagues who are just finishing their day.

9am — Cluster Analysis.

Clustering time. I pull up R (statistics software) and start coding. I’m looking for below-average users of a product for an education-based marketing campaign. Clustering is an ML (machine learning) technique to help me find ‘natural’ groupings of users. In this case, I’m using ML to identify the natural definition of ‘below-average’. My ML tool gives me clusters of high-to-low users based on certain metrics. I take the assigned definition(s) of users below average and that becomes my audience.

Cluster example; Source: CSSA
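The clustering step above can be illustrated with a toy example. The sketch below is a bare-bones one-dimensional k-means in Python; a real analysis would use R’s kmeans() or a library such as scikit-learn, across several metrics, and the usage numbers here are made up:

```python
import statistics

def kmeans_1d(values, k=3, iters=100):
    """Toy 1-D k-means on a single usage metric (assumes k >= 2)."""
    vals = sorted(values)
    # Initialise centroids at evenly spaced quantiles of the data.
    centroids = [vals[(i * (len(vals) - 1)) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        # Assign each value to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Recompute centroids; stop once they no longer move.
        new = [statistics.mean(c) if c else centroids[i]
               for i, c in enumerate(clusters)]
        if new == centroids:
            break
        centroids = new
    return centroids, clusters

# Weekly active minutes per user (made-up numbers).
minutes = [5, 8, 12, 40, 45, 50, 120, 130, 140, 150]
centroids, clusters = kmeans_1d(minutes, k=3)

# The cluster with the lowest centroid is the 'below-average' audience.
low_cluster = clusters[centroids.index(min(centroids))]
print(sorted(low_cluster))  # → [5, 8, 12]
```

The point of using clustering rather than a hand-picked threshold is that the boundary between “below-average” and “average” falls out of the data itself.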

9.45am — Consultation with Marketing Colleagues.

I meet with marketing team members to help them define strategy and approach for an upcoming campaign. We talk about potential audiences, Key Performance Indicators (KPIs), strategies and budget. It’s fun. I enjoy the creativity and brainstorming.

10.30am — Reviewing Dashboards and Trends.

Back at my desk and it’s “dashboard review and email” time. I maintain a total of four automated dashboards for the marketing team. These dashboards cover demographics, marketing performance, regional data segmentation and marketing benchmarks. I aim to send out a short email every two weeks covering trends in the dashboards. Today, I’m reviewing and emailing an update about the regional data trends dashboard — looking at country-specific trends to help the in-country marketing teams.

Tableau example; Source: Analytics Vidya

11.15am — Campaign Results Analysis.

Results time. A two-month marketing campaign in France just finished and the team is looking for results. I pull up the media impression files and start the analysis. Almost all of our marketing is measured using a test and control format — test audiences receive marketing while a similar control audience receives none. I use SQL and R to compare the behavior of the test and control groups. My goal is to see if there is a statistically significant difference between the groups on product behavior metrics. Results and learnings go into a summary report card document as well as our benchmarking database.
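The test-versus-control comparison boils down to a two-sample significance test. Here is a minimal sketch of Welch’s t-test in Python with made-up per-user engagement numbers; the real analysis runs in SQL and R, and the metric here is purely illustrative:

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t-statistic and approximate degrees of freedom for two
    independent samples with possibly unequal variances."""
    ma, mb = statistics.mean(sample_a), statistics.mean(sample_b)
    va, vb = statistics.variance(sample_a), statistics.variance(sample_b)
    na, nb = len(sample_a), len(sample_b)
    se2 = va / na + vb / nb               # squared standard error of the diff
    t = (ma - mb) / math.sqrt(se2)
    # Welch–Satterthwaite approximation for degrees of freedom.
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Minutes of product use per user after the campaign (illustrative).
test_group    = [34, 36, 40, 38, 41, 35, 39, 42]
control_group = [30, 31, 33, 29, 32, 30, 34, 31]

t, df = welch_t(test_group, control_group)
```

A t-statistic well above ~2 at these degrees of freedom indicates a statistically significant lift; in practice you would also compute a p-value and sanity-check the test’s assumptions before reporting.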

12 noon — Meeting with Product Analysts.

Lunch meeting. I meet with product data scientists who provide the latest trends they’re seeing and new data tables they’ve set up. Product analysts are responsible for understanding deep trends and nuances in a specific product area. They also build and maintain key data tables for their product area. As a marketing analyst, I’m responsible for looking at correlations and interaction across all product areas. So I rely on product analysts for their in-depth insights and use their various data tables to measure the marketing impact on user behavior. We talk Hive tables and trends.

12.45pm — Working Session to Build a Machine Learning Tool.

I meet with the data engineer on my team. We’re building a machine-learning tool that will automate sections of our marketing campaigns to help us test and learn at scale. It’s an exciting project. We spend the time talking algorithms, translating my R code into Python and figuring out the databases we need to set up. Tech nerds unite.

1.30pm — Predicting ROI.

Back at my desk. My next project is focused on predicting the ROI (return on investment) of a planned marketing campaign. The goal is to figure out if the campaign is worth the investment. If I give the green light, millions of dollars will be committed and three different teams will work on this project — so I make sure I check my numbers carefully. I look at metrics such as potential reach, estimated click-through rate and approximate CPA (cost per acquisition). I’m balancing the need for a quick reply with the need for accuracy. This juggling act is a common ailment of the modern marketing analyst. I use benchmarks and multivariate regression to run the prediction.
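The regression piece can be sketched simply. The benchmark numbers below are invented, and the sketch uses a single predictor for brevity where the real model is multivariate:

```python
def ols_fit(xs, ys):
    """Ordinary least squares fit y = a + b*x (one predictor for brevity;
    the real analysis would be multivariate)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Benchmarks from past campaigns: spend ($k) vs incremental revenue ($k).
spend   = [100, 150, 200, 250, 300]
revenue = [140, 200, 270, 320, 390]

a, b = ols_fit(spend, revenue)

planned_spend = 400
predicted_revenue = a + b * planned_spend   # ≈ 512 ($k)
roi = (predicted_revenue - planned_spend) / planned_spend  # ≈ 0.28
```

Here a planned $400k campaign predicts roughly $512k in incremental revenue, a positive expected ROI; whether that clears the bar is then a business decision, not an analytics one.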

2.15pm — Geographic Mapping of Activity.

Location, location, location. I shift my focus to geography and mapping. Some teammates have been given the green light to create a program of ‘pop-up’ events in cities around the UK. I’ve already pre-selected the cities in our planning and strategy sessions. Now, they need to know specifically where in each city they should run their event. They ask for locations that have a lot of lunchtime foot traffic. I use location data with mapping functionality in R (I like the Leaflet mapping API) to create heat maps of foot traffic by location. I pull out the top three locations and send them to the marketing team. I love this project; I’m a nerd for a good heat map.

Heat map example; Source: Stack Overflow
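Under the hood, a foot-traffic heat map is just binned counts. A minimal Python sketch, assuming the location pings have already been projected to metres in a local grid; the coordinates and cell size are made up:

```python
from collections import Counter

def busiest_cells(pings, cell_m=100, top_n=3):
    """Bin (x, y) location pings (metres in a local grid) into square
    cells and return the top_n busiest cells with their ping counts."""
    grid = Counter((x // cell_m, y // cell_m) for x, y in pings)
    return grid.most_common(top_n)

# Lunchtime foot-traffic pings, projected to metres (made-up numbers).
pings = [(120, 230), (140, 250), (180, 210),   # cell (1, 2): 3 pings
         (520, 610), (540, 640),               # cell (5, 6): 2 pings
         (900, 100)]                           # cell (9, 1): 1 ping

print(busiest_cells(pings))  # → [((1, 2), 3), ((5, 6), 2), ((9, 1), 1)]
```

The heat map is just these counts rendered as colour intensity; the top-three list sent to the marketing team is the same aggregation, sorted.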

3pm — Writing a Measurement Plan.

Next up, I need to write the measurement plan portion of a marketing campaign. All campaign plans are required to include a measurement section that outlines KPIs, secondary metrics, goals, target audiences, geographies and measurement approach. My research colleagues will also add to this section if there’s an awareness or sentiment research component. Our organization has a strong culture of test and measure, so analytics and research (if applicable) sign off on all marketing plans.

3.45pm — Emails Round Two.

It’s nearing the end of the day. I take a moment to check my emails and answer questions from the day — or outstanding questions I’ve not answered from previous days. I receive a lot of questions throughout the day which, over time, has taught me to be succinct and direct in my replies. I lean into active voice as much as I can. I’ve all but removed the fluff from my emails these days because there’s simply not enough time. I save the chit-chat for in-person conversations and coffee meet-ups.

4.30pm — Review Data Trends in Japan.

My last meeting of the day — I’m talking behavior trends with our Country Marketing Manager for Japan. He’s a passionate, funny guy who loves to dive into the numbers. I try to meet with the country marketing managers every two months to give them an update on what I’m seeing in the dashboards. It’s also a great opportunity to hear from them about important issues, areas where I can help provide data insights, and their promotional plans. I pull up the interactive dashboard I’ve built for Japan and we chat data. Tableau is my go-to for small data sets, but it often can’t handle the data tables I work with, so I also use a dashboard tool developed internally.

5.15pm — Setting up SQL Queries to Run Overnight.

It’s getting to the end of my day. My last task is to set up some SQL queries to run overnight that will create data tables for tomorrow’s analyses. I run them overnight because there are more server resources available after people go home. And when you’re dealing with tables that have billions and trillions of rows, you need all the server resources you can wrangle.

As you can see, the Marketing Analytics profession goes far beyond clicks and conversions. It’s a mixed bag of dashboards, algorithms, coding and internal consultation. This variety is what I enjoy about the profession. Weeks are rarely the same.

Source :
