Part 1: Probability and Bayes
The company labels each article with a flag that determines whether it is fake or not. A group of experts and independent journalists studied a large sample of articles labeled by the company to determine the accuracy of the company’s labeling system. The group studied 9,821 articles and found 932 articles containing fake news. Of these 932 articles, 886 were labeled as fake by the company. Given this information, answer the following questions:
1. The company wants to advertise the accuracy of its fake news detection mechanism. As such, they want to calculate the probability that an article is labeled as fake (company’s label), given that it contains fake news (experts’ assessment). What is this probability? Based on this measure, can the company claim its accuracy is greater than 95%?
2. What is the probability that a randomly selected article is labeled fake by both the company and the experts?
3. The company knows that of all the 9,821 articles in this sample, 1,370 were labeled as fake news by the company’s fake news detection system. Given this information, what is the probability that a randomly selected article is labeled with a fake news flag either by the company or the experts?
4. Conditional on the fact that an article is labeled by the company as fake, what is the probability that the article contains fake news based on the experts’ assessment? Is this probability different from the probability in Q1? Why?
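The four quantities above all come from the same counts, so a short calculation can serve as a check on hand-worked answers. The following is a minimal sketch, not an official solution; the variable names are illustrative, and the "either" probability uses the inclusion-exclusion rule P(A or B) = P(A) + P(B) − P(A and B).

```python
# Counts taken directly from the problem statement.
total = 9821          # articles studied by the experts
expert_fake = 932     # articles the experts found to contain fake news
both_fake = 886       # of those, articles the company also labeled fake
company_fake = 1370   # articles the company labeled fake

# Q1: P(company labels fake | experts say fake)
p_label_given_fake = both_fake / expert_fake

# Q2: P(company labels fake AND experts say fake)
p_both = both_fake / total

# Q3: P(company labels fake OR experts say fake), by inclusion-exclusion
p_either = (company_fake + expert_fake - both_fake) / total

# Q4: P(experts say fake | company labels fake)
p_fake_given_label = both_fake / company_fake

print(f"Q1: {p_label_given_fake:.4f}")   # about 0.9506
print(f"Q2: {p_both:.4f}")               # about 0.0902
print(f"Q3: {p_either:.4f}")             # about 0.1442
print(f"Q4: {p_fake_given_label:.4f}")   # about 0.6467
```

Q1 and Q4 share the same numerator but condition on different events (932 expert-flagged articles versus 1,370 company-flagged articles), which is why the two values differ.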
Part 2: Discrete Distributions
The company offers a menu of different order sizes and prices per order as presented in Table 1. For example, if the client chooses the order size of 10 at a price of $14.99 per order, it means that they want 10 articles to be fact-checked, and they are willing to pay $14.99 for each article, which makes a total revenue of $149.90 for the company. The company has a client named MNC, which is a news aggregator website that places orders daily. Given the uncertainty in the number of articles that require fact-checking every day, the menu item ordered by MNC has some variability. The data analytics team at the company used the data of past orders from this client and created the following table that shows the probability that the client would choose each menu item. Given the information in this table, answer the following questions:
1. Define a random variable that is equal to the number of orders placed by MNC per day. What is the expected value, variance, and standard deviation of this random variable?
2. Define a random variable that is equal to the daily revenue generated by the MNC order. What is the probability that the daily revenue exceeds $200? What is the expected value, variance, and standard deviation of this random variable?
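The mechanics of both questions can be sketched in a few lines. Because the table's values are not reproduced here, the menu below is hypothetical: only the pairing of order size 10 with $14.99 comes from the text, and every other size, price, and probability is a placeholder to be replaced with the real entries. The helper name `moments` is also just an illustrative choice.

```python
from math import sqrt

# HYPOTHETICAL (order size, price per article, probability) rows.
# Only the size-10 / $14.99 pairing appears in the text; all other
# numbers, including every probability, are placeholders for Table 1.
MENU = [
    (5, 19.99, 0.2),
    (10, 14.99, 0.5),
    (20, 11.99, 0.3),
]

def moments(values, probs):
    """Return (mean, variance, standard deviation) of a discrete random variable."""
    mean = sum(v * p for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    return mean, var, sqrt(var)

sizes = [size for size, _, _ in MENU]
revenues = [size * price for size, price, _ in MENU]  # revenue = size x price
probs = [p for _, _, p in MENU]

size_stats = moments(sizes, probs)
revenue_stats = moments(revenues, probs)

# P(revenue > 200): add up the probabilities of the menu items whose revenue exceeds $200.
p_revenue_over_200 = sum(p for r, p in zip(revenues, probs) if r > 200)

print("order size  (mean, var, sd):", size_stats)
print("revenue     (mean, var, sd):", revenue_stats)
print("P(revenue > $200):", p_revenue_over_200)
```

Note that revenue is not a fixed multiple of order size, because the price per article changes with the menu item, so its variance has to be computed from the revenue values themselves rather than scaled from the order-size variance.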
Part 3: Normal Distribution
As discussed earlier, the company uses crowd-sourcing techniques for fact-checking, where it recruits a group of individuals to perform the task of reviewing the articles and assigning veracity scores. Based on the past information, the company knows the time it takes a random individual to complete the task follows a Normal distribution with a mean of approximately 18 minutes and a standard deviation of 4 minutes. Given the information, answer the following questions:
1. The company wants to know the following probabilities:
   a. What is the probability that an individual completes the task in less than 15 minutes?
   b. What is the probability that an individual completes the task in more than 15 minutes?
   c. What is the probability that the time it takes an individual to complete the task is between 20 and 25 minutes?
2. Suppose we have two individuals, Alice and Bob, who perform the same task without interacting with each other. What can we say about the probability that Bob performs the task in less than 14 minutes, and Alice performs the task in more than 21 minutes? Why?
3. The company wants to identify individuals who take a very long or a very short time to complete the task. They want to identify the top 5% and the bottom 5% of the individuals.
   a. What is the cutoff for the task completion time that determines the bottom 5%?
   b. What is the cutoff for the task completion time that determines the top 5%?
   c. Are these cutoffs different from the cutoffs for outliers in a box plot? (Hint: You need to calculate the first quartile Q1, the third quartile Q3, and the interquartile range IQR. Given these inputs, the cutoffs for the outliers in a box plot are Q1 − 1.5 × IQR and Q3 + 1.5 × IQR.)
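The Normal-distribution questions above can be checked numerically with Python's standard-library `statistics.NormalDist`. This is a sketch rather than an official solution. It treats the mean as exactly 18 minutes, and it assumes that "without interacting with each other" means Alice's and Bob's completion times are independent, so their joint probability is a product.

```python
from statistics import NormalDist

# Task completion time in minutes: mean 18, standard deviation 4 (from the text).
task_time = NormalDist(mu=18, sigma=4)

# Probabilities for a single individual
p_less_15 = task_time.cdf(15)                      # about 0.2266
p_more_15 = 1 - p_less_15                          # about 0.7734
p_20_to_25 = task_time.cdf(25) - task_time.cdf(20) # about 0.2685

# Bob under 14 minutes AND Alice over 21 minutes.
# ASSUMPTION: the two times are independent, so the probabilities multiply.
p_bob_and_alice = task_time.cdf(14) * (1 - task_time.cdf(21))  # about 0.0360

# Cutoffs for the bottom 5% and the top 5%
bottom_5 = task_time.inv_cdf(0.05)   # about 11.42 minutes
top_5 = task_time.inv_cdf(0.95)      # about 24.58 minutes

# Box-plot outlier fences from the theoretical quartiles
q1 = task_time.inv_cdf(0.25)
q3 = task_time.inv_cdf(0.75)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr         # about 7.21 minutes
upper_fence = q3 + 1.5 * iqr         # about 28.79 minutes

print(f"P(T < 15) = {p_less_15:.4f}, P(T > 15) = {p_more_15:.4f}")
print(f"P(20 < T < 25) = {p_20_to_25:.4f}")
print(f"P(Bob < 14 and Alice > 21) = {p_bob_and_alice:.4f}")
print(f"5% cutoffs: {bottom_5:.2f} and {top_5:.2f} minutes")
print(f"Box-plot fences: {lower_fence:.2f} and {upper_fence:.2f} minutes")
```

Under these assumptions the box-plot fences sit well outside the 5% cutoffs, so the box-plot rule would flag far fewer individuals than the top-and-bottom-5% rule.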
Part 4: Simulation and Optimal Decision-making
The company has a list of email addresses of potential fact-checkers who can review articles and receive a payment per task. For every order, the company sends an email to a selected subset of these fact-checkers. The email contains a link that directs the fact-checker to the article they need to review. When a fact-checker clicks on the link, they are assigned to review an article that has not yet been reviewed.
Each fact-checker is paid $8 once the task is completed. Given the size of an order, the company needs to decide on the number of email recipients. There are two situations that the company must consider when making this decision: