Pinecone, a serverless vector database for machine learning, leaves stealth with $10M funding
Feb. 18, 2021, 1:36 a.m. in Machine Learning
Topline
Pinecone, a new startup from the people who helped launch Amazon SageMaker, has created a vector database that generates data in a specialized format to help build machine learning applications faster, something previously accessible only to the largest organizations. Today, the company came out of stealth with a new product and announced an initial investment of $10 million led by Wing Venture Capital.

Key Facts
The vector engine contains the data structures and algorithms needed to index large amounts of high-dimensional vector data
It converts data into a format that machine learning models can ingest
Pinecone was created to make this technology available to any business
Vectors are ubiquitous in machine learning
Details
Edo Liberty, the company's co-founder, says he founded the company out of a fundamental belief that the industry was being held back by the lack of broader access to this type of database.
"The data that a machine learning model expects is not a JSON record, it is a high-dimensional vector that is a list of characteristics or what is called an embedding, which is a numerical representation of the elements or objects of the world. This format is much more semantically rich and actionable for machine learning," he explained.
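To make the contrast between a JSON record and a vector concrete, here is a minimal sketch of turning a record into a hand-built list of numeric characteristics. The record, the category vocabulary, and the scaling constants are all assumptions made up for illustration; a real embedding would typically be produced by a trained model rather than written by hand.

```python
# Hypothetical product record, as it might sit in a JSON document store.
record = {"category": "shoes", "price": 80.0, "rating": 4.5}

# Assumed fixed vocabulary used for one-hot encoding the category.
CATEGORIES = ["shoes", "shirts", "hats"]
MAX_PRICE = 200.0   # assumed upper bound, scales price into [0, 1]
MAX_RATING = 5.0    # assumed upper bound, scales rating into [0, 1]


def to_feature_vector(rec):
    """Turn a record into a flat list of floats (a feature vector)."""
    one_hot = [1.0 if rec["category"] == c else 0.0 for c in CATEGORIES]
    return one_hot + [rec["price"] / MAX_PRICE, rec["rating"] / MAX_RATING]


vector = to_feature_vector(record)
print(vector)  # [1.0, 0.0, 0.0, 0.4, 0.9]
```

Every record becomes a point in the same five-dimensional space, which is what lets a model or a database compare records numerically.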
He says this is a concept widely understood by data scientists and supported by research. Still, until now, only the largest and most technically sophisticated companies, like Google or Pinterest, could take advantage of it.
Liberty and his team created Pinecone to make this kind of technology available to any business.
The startup spent the last few years building the solution, which consists of three main components. The main piece is a vector engine that converts data into this ingestible machine learning format.
Liberty says that this is the piece of technology that contains all the data structures and algorithms that allow them to index substantial amounts of high-dimensional vector data and search through it efficiently and accurately.
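The article does not describe how Pinecone's engine works internally, so the following is only a toy stand-in for the idea of indexing vectors and searching them by similarity. It compares the query against every stored vector (brute force); production vector engines generally rely on approximate nearest-neighbor index structures to stay fast at scale. The class name `BruteForceIndex` and its methods are hypothetical and are not Pinecone's API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class BruteForceIndex:
    """Hypothetical in-memory index: stores vectors by id, scans all on query."""

    def __init__(self):
        self._vectors = {}

    def upsert(self, item_id, vector):
        # Insert a new vector or overwrite the one stored under item_id.
        self._vectors[item_id] = vector

    def query(self, vector, top_k=2):
        # Score every stored vector, then keep the top_k most similar ids.
        scored = [
            (cosine_similarity(vector, stored), item_id)
            for item_id, stored in self._vectors.items()
        ]
        scored.sort(reverse=True)
        return [item_id for _, item_id in scored[:top_k]]


index = BruteForceIndex()
index.upsert("a", [1.0, 0.0, 0.0])
index.upsert("b", [0.9, 0.1, 0.0])
index.upsert("c", [0.0, 0.0, 1.0])

result = index.query([1.0, 0.05, 0.0], top_k=2)
print(result)  # ['a', 'b']
```

The full scan costs time proportional to the number of stored vectors per query, which is exactly the cost that specialized index structures are meant to avoid.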
The second is a cloud-hosted system to apply all of that converted data to the machine learning model while handling things like index lookups and pre- and post-processing: everything a data science team needs to run a machine learning project at scale, with very high workloads and throughput.
The third is a management layer to track all of this and manage data transfer between source locations.
A classic example Liberty uses is an e-commerce recommendation engine. While this has been a standard part of online sales for years, he believes that using a vectorized data approach will give much more accurate recommendations. He says that data science research confirms this.
"It used to be that implementing something like a recommendation engine was actually incredibly complex, and if you have access to a production-grade database, 90% of the difficulty and heavy lifting in creating those solutions disappears, and that is why we are building this. We believe it is the new standard," he said.
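As a rough illustration of the recommendation example, the sketch below averages the vectors of items a shopper has already bought and recommends the remaining items whose vectors point in the most similar direction. The catalog, its three-dimensional vectors, and the `recommend` function are invented for this example; real systems would use learned embeddings with many more dimensions.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical item vectors; dimensions loosely mean sporty, formal, outdoor.
ITEMS = {
    "running shoes": [0.9, 0.0, 0.4],
    "trail shoes": [0.7, 0.0, 0.7],
    "dress shoes": [0.1, 0.9, 0.0],
    "hiking boots": [0.3, 0.0, 0.9],
    "necktie": [0.0, 1.0, 0.0],
}


def recommend(purchased, top_k=2):
    """Rank items not yet purchased by cosine similarity to the shopper profile."""
    dims = len(next(iter(ITEMS.values())))
    # The profile is the average of the purchased items' vectors.
    profile = [
        sum(ITEMS[name][i] for name in purchased) / len(purchased)
        for i in range(dims)
    ]
    profile = normalize(profile)
    candidates = [
        (dot(profile, normalize(vec)), name)
        for name, vec in ITEMS.items()
        if name not in purchased
    ]
    candidates.sort(reverse=True)
    return [name for _, name in candidates[:top_k]]


recommendations = recommend(["running shoes"])
print(recommendations)  # ['trail shoes', 'hiking boots']
```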
Finally, Pinecone has its own language and supports the CRUD operations typical of databases.
However, it does not use the SQL-style syntax typical of other kinds of databases. How, then, do you retrieve documents that were created after a particular date and contain a certain keyword?
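The article leaves that question unanswered. One generic way a vector store could handle it, shown here purely as an assumption and not as Pinecone's actual query language, is to filter documents on their metadata first and then rank the survivors by vector similarity. The document fields (`created`, `keywords`, `vector`) and the `search` function are hypothetical.

```python
from datetime import date


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical documents: metadata plus a unit-length vector, so the
# dot product below equals cosine similarity.
DOCS = [
    {"id": "d1", "created": date(2021, 1, 5), "keywords": {"pricing"}, "vector": [1.0, 0.0]},
    {"id": "d2", "created": date(2021, 2, 10), "keywords": {"pricing", "cloud"}, "vector": [0.6, 0.8]},
    {"id": "d3", "created": date(2021, 2, 15), "keywords": {"security"}, "vector": [1.0, 0.0]},
    {"id": "d4", "created": date(2021, 2, 17), "keywords": {"pricing"}, "vector": [0.8, 0.6]},
]


def search(query_vector, created_after, keyword, top_k=5):
    """Filter on metadata first, then rank the matches by similarity."""
    matches = [
        doc for doc in DOCS
        if doc["created"] > created_after and keyword in doc["keywords"]
    ]
    matches.sort(key=lambda doc: dot(query_vector, doc["vector"]), reverse=True)
    return [doc["id"] for doc in matches[:top_k]]


hits = search([1.0, 0.0], created_after=date(2021, 2, 1), keyword="pricing")
print(hits)  # ['d4', 'd2']
```

Here `d1` is excluded by the date filter and `d3` by the keyword filter, even though both have vectors identical to the query; only the remaining documents are ranked.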