In 2017, TikTok’s parent company, ByteDance, scraped short-form videos, usernames, profile pictures, and profile descriptions from Instagram, Snapchat, and other sources and then uploaded them — without users’ knowledge or consent — to Flipagram, a TikTok predecessor, according to four former employees of the company.
BuzzFeed News spoke with the four former ByteDance employees, all of whom worked on Flipagram (later renamed Vigo Video), and viewed internal documents that indicate the scraping was run by an engineering team in China and began soon after ByteDance acquired Flipagram in January 2017. The former employees described the project as one of several “growth hacks” — including the manipulation of like and video view statistics — employed by the company. One of the former employees said the scraping affected hundreds of thousands of accounts, and a document viewed by BuzzFeed News detailed plans to “crawl video > 10k/day in P0 countries” — according to the former employee, this meant the team’s goal was to scrape more than 10,000 videos a day in the highest priority countries. The former employees spoke to BuzzFeed News under the condition of anonymity because they feared retribution from ByteDance.
The former employees do not know when the scraping they say they were aware of stopped. Two of them say that the scraped content was used to train ByteDance’s powerful “For You” personalization algorithm on US-based content so that it would better reflect the preferences of US users. Today, the “For You” algorithm powers both TikTok and its Chinese equivalent, Douyin. (Disclosure: In a previous life, I held policy positions at Facebook and Spotify.)
BuzzFeed News sent ByteDance a comprehensive list of the allegations we intended to print in this article as well as a detailed set of questions, including whether data sets from Flipagram were ever used to train the “For You” algorithm that powers TikTok today or to train any other algorithms currently in use by ByteDance.
In response, ByteDance spokesperson Jennifer Banks wrote back two sentences: “ByteDance acquired Flipagram in 2017 and operated it, and subsequently Vigo, for a short time. Flipagram and Vigo ceased operations years ago and aren’t connected to any current ByteDance products.”
Flipagram founder and former CEO Farhad Mohit did not respond to requests for comment, nor did his cofounders Raffi Baghoomian and Joshua Feldman. BuzzFeed News did reach Brian Dilley, who was Flipagram’s chief technology officer until October 2017, at his home. When asked whether the company had been scraping and reuploading content in 2017, he replied, “No, in fact I’m positive we were not.” He then ended the interview. BuzzFeed News sent Dilley a follow-up email asking him to elaborate on his answer and explain his understanding of what was happening at the time. Dilley reiterated that the company had not scraped other platforms during his time there.
The documents reviewed by BuzzFeed News include explicit references to scraped content and the use of fake accounts. In one document, an employee lays out the reasons that the company used “fake accounts” and scraped content; among them were that the accounts could be used to test which content performed best on the platform, and that current users could mimic the scraped content to improve their own popularity. In another document, a different employee explains that a certain account had been scraped and copied onto Flipagram from Instagram. A third document lists account scraping as an “OKR” (‘objective and key result’) for an engineering team in China.
According to the documents, ByteDance began copying content from some of its China-focused short-form video apps and uploading it to Flipagram through fake accounts in early 2017. One document details how the company tried to curate content that was “not too Chinese” and would resonate with US users, but three of the former employees say the content still didn’t perform well with Flipagram’s user base.
In mid-2017, according to the four former employees, ByteDance began scraping and reuploading content from the US. Three of the former employees, and one of the documents, identify Instagram as a source of the scraped content. Two of the former employees remember the company scraping and uploading content from Snapchat and Musical.ly — an app popular with tweens and teens that ByteDance acquired in late 2017, and that would eventually become TikTok.
One of the former employees who identified Snapchat and Musical.ly as sources of the scraping did not identify Instagram as one. This person expressed doubt that the platform was scraped because at least some Instagram videos at the time were square in shape, and videos in the Flipagram app were not. However, another former employee told BuzzFeed News that they recalled conversations about resizing videos and removing watermarks placed on content by other platforms, so that users could not tell that the scraped content originated elsewhere.
Instagram’s and Snap’s terms of service forbade scraping in 2017, as they do today. At the time, Musical.ly’s terms of service prohibited users from “mak[ing] unauthorized copies of any content made available on or through” the platform.
Jason Grosse, a representative for Instagram’s parent company Meta, said the company would not comment at this time. Russ Caditz-Peck, a spokesperson for Snap, said, “Our Terms of Service prohibit scraping and reposting public content from our services, and we implement defenses to limit such attempts.”
In other circumstances, allegations that companies have scraped and reused content without permission have spurred litigation, by both companies and the individuals who made the content in question. (Scraping, or crawling, which simply means using a computer to copy information at scale, can also be an invaluable tool for researchers and journalists seeking to better study and analyze public content.) Companies that have used fake accounts to lure users to their platforms have also been sued by state and federal regulators for deceptive business practices.
Some people noticed that their content had been uploaded to Flipagram without their knowledge or consent, according to the four former employees and complaints made on Twitter. The four former employees told BuzzFeed News that the company received emails from creators who said they were being impersonated on the app. Two of those people recall inquiries from parents asking why their children’s content was on a platform that neither they nor their children had ever heard of. The four sources said employees were instructed to delete the offending accounts or give the person complaining control over them, and tell the complaining creators that Flipagram cannot prevent a user (or fan) from uploading someone else’s content.
The former employees also described other “growth hacks” that ByteDance used to try to make Flipagram popular in 2017. According to three of the former employees, the company manipulated like and video view counts displayed in the app to make creators believe they were more popular than they were. “One like was not one like,” said a former employee who witnessed the manipulation. (Facebook has faced similar allegations that it knowingly inflated video view metrics to increase advertising revenue, which it has disputed.)
According to an internal document, ByteDance also capped video views from scraped content at a certain level; one of the former employees explained this was so that scraped content views would not overwhelm content posted by real Flipagram users. Additionally, according to two sources, Flipagram limited how frequently it would recommend “cross-posts” — content posted first to other platforms, and then reposted to Flipagram — to incentivize creators to post content first to Flipagram and only later to other platforms.
ByteDance did not respond to questions about metric manipulation and recommendation practices at Flipagram.
One former employee portrayed ByteDance’s growth tactics as a symptom of a larger, industry-wide obsession with growth at any cost. “The US public and US media often attribute unethical growth strategies practiced by Chinese tech companies to ‘Chinese tech culture,’ when very often those tactics are directly copied from FAANG companies,” they said, using an acronym for the American tech giants Facebook, Amazon, Apple, Netflix, and Google. Invoking Steve Jobs’ famous quote that “great artists steal,” and Mark Zuckerberg’s controversial axiom “move fast and break things,” this person continued: “Chinese tech culture is not the enemy. Chinese tech culture is an honest mirror.”
Flipagram was founded in Los Angeles by Farhad Mohit in 2013 as a photo collage and short-form video app. It attracted a young audience — largely teens and tweens — and was once viewed as a threat to Instagram. In January 2017, it was acquired by ByteDance’s news aggregator app, Toutiao, which later rebranded it as Vigo Video. Later that year, ByteDance also acquired the lip-synching app Musical.ly, one of Flipagram’s key rivals.
For a while, staff for the two apps worked alongside one another in Flipagram’s open-plan office building in Los Angeles. The former employees described the period as awkward; as one former employee put it, the teams’ history of competition “led to an uncomfortable and very uncollaborative energy in the workplace.” The products, the source said, “were so similar I don’t think anyone felt like ByteDance was going to put their funding fully behind both.” In February 2018, ByteDance laid off members of the LA-based Flipagram team. Months later, it rebranded Musical.ly as TikTok.
The relationship between Flipagram and TikTok is described differently by different people. On his website and LinkedIn profile, Flipagram founder Mohit describes Flipagram as “now TikTok.” Flipagram’s company profile on LinkedIn describes it the same way. But when ByteDance rolled out TikTok in the US, it was Musical.ly users, not Flipagram users, who opened their apps to a new name, logo, and experience.
ByteDance also did not answer questions from BuzzFeed News about where and how it stored any data it allegedly scraped from Instagram and other platforms. TikTok has undergone a massive initiative in the past year to isolate data from users inside the US in an effort to quell regulators’ fears that the data could be accessed by the Chinese government. But it is unclear whether data from Flipagram — including the allegedly scraped data — was ever kept in data centers in China, or whether it remains there today.
When reached for comment by BuzzFeed News about the alleged scraping, Sen. Richard Blumenthal called on regulators to investigate: “The FTC must swiftly investigate ByteDance’s alleged theft of data from Instagram and Snapchat users — including kids and teens — to deceive the public and boost their algorithm. This type of wrongful and greedy corporate conduct only underscores the urgent need for Congress to pass stronger kids’ privacy and safety legislation.”
This is not the first time ByteDance has been accused of controversial intellectual property practices. Last year, competing Chinese tech giant Tencent filed numerous claims against ByteDance for alleged copyright infringement on its Douyin app. Audiovisual software company Beijing Meishe Network Technology Co. also filed a suit alleging that the company stole, and removed copyright-restrictive language from, proprietary code. (ByteDance did not respond to a request for comment on either of the suits.) The company has also faced privacy lawsuits in the past: ByteDance agreed to pay $92 million last year to settle a lawsuit alleging that the company harvested biometric information from TikTok users without permission. When asked for comment by the Associated Press at the time, TikTok provided the following statement: “While we disagree with the assertions, rather than go through lengthy litigation, we’d like to focus our efforts on building a safe and joyful experience for the TikTok community.”
Flipagram had a fraught history with intellectual property too, even before ByteDance acquired it. In 2016, CEO Farhad Mohit admitted that the company had initially allowed users to create content using music that the platform did not have the right to play. In an interview with Recode at the time, Mohit revealed his thinking on bending rules in search of growth.
“We did it kind of like entrepreneurs do sometimes, we kind of just did it and [decided] we’d ask for permission after.”