6 min read

Audit Data Analytics Part1

Introduction

Accounting professionals tend to work under pressure so that they can cover as much ground as possible. However, tools at their disposal often have very limited power and are unable to optimistically fit their workflow. Furthermore, their work is prone to errors due to too many copy+paste and unreproduciblity.

In my view, R is a perfect tool to increases their productivity and conform to their workflow. Watch Hadley Wickham on Data Science Challenges and Data visualization and data science, Hadley Wickham to learn more about it.

Data analytics lifecycle described by R4DS sensibly matchs audit routines. This document is meant to demonstrate the usefulness of R in the field of audit via a specific use case. Visit my github repo for project details.

Data preparation

Data scientists spend 80% of their time getting the data into desired shape and only 20% of their time on actual data analysis.

The case study assumes that auditors are at the stage of audit planning and will use R to perform audit procedures. To obtain data, auditors will typically go through media coverage, company announcements and Genernal Ledger (Database or PBC). R has excellent tools for webscrapping, pdf table extraction and SQL queries.

Unless there are unknown bank accounts, all company transactions will be recorded in G/L. G/L contains valuable information for auditors to perform risk assessment and preliminary analytical procedures efficiently and effectively. Unfortunately, auditors often have difficulties in cleaning G/L and extracting information out of G/L. This is particularly true in the case of JV memo or description. Using R to solve those issues would be like a breeze.

Dataset

  # A tibble: 10 x 14
        id account subaccount type  date       num   name  memo  split debit
     <dbl> <chr>   <chr>      <chr> <date>     <chr> <chr> <chr> <chr> <dbl>
   1     2 Compan~ Company C~ Check 2018-01-01 5001  Kuve~ Jana~ Rent      0
   2     3 Compan~ Company C~ Check 2018-01-01 5000  Onli~ Set ~ Acco~     0
   3     4 Compan~ Company C~ Depo~ 2018-01-01 <NA>  <NA>  Depo~ -SPL~ 31349
   4     5 Compan~ Company C~ Check 2018-01-05 5002  Debo~ Pett~ Pett~     0
   5     6 Compan~ Company C~ Check 2018-01-08 5003  Gas ~ ACCT~ Util~     0
   6     7 Compan~ Company C~ Payc~ 2018-01-15 3001  King~ <NA>  -SPL~     0
   7     8 Compan~ Company C~ Check 2018-01-15 5007  Auto~ Car ~ Car ~     0
   8     9 Compan~ Company C~ Payc~ 2018-01-15 3003  Boli~ <NA>  -SPL~     0
   9    10 Compan~ Company C~ Payc~ 2018-01-15 3004  Deen~ <NA>  -SPL~     0
  10    11 Compan~ Company C~ Check 2018-01-22 5011  Offi~ Acct~ Supp~     0
  # ... with 4 more variables: credit <dbl>, balance <dbl>, weekday <ord>,
  #   month <ord>

Data dictionary

Completeness test

Chart of Accounts (COA)

Control total

“b/f + current year = c/f” cannot be performed due to the lack of prior year.

  # A tibble: 1 x 2
       debit   credit
       <dbl>    <dbl>
  1 2136029. 2136029.

Range of date

  [1] "2018-01-01" "2018-12-31"

Missing values

Round numbers

  # A tibble: 36 x 8
        id account      subaccount      type    date       num   debit credit
     <dbl> <chr>        <chr>           <chr>   <date>     <chr> <dbl>  <dbl>
   1    27 Company Che~ Company Checki~ Check   2018-01-31 5025      0   1000
   2    40 Company Che~ Company Checki~ Transf~ 2018-02-09 <NA>      0  10000
   3   101 Company Che~ Company Checki~ Check   2018-03-26 5075      0  10000
   4   172 Company Che~ Company Checki~ Check   2018-05-25 5124      0  12000
   5   210 Company Che~ Company Checki~ Bill P~ 2018-06-29 5156      0   1000
   6   226 Company Che~ Company Checki~ Transf~ 2018-07-20 <NA>      0  10000
   7   240 Company Che~ Company Checki~ Check   2018-07-25 5179      0  10000
   8   272 Company Che~ Company Checki~ Check   2018-08-22 5206      0   3000
   9   279 Company Che~ Company Checki~ Check   2018-08-27 5214      0  12000
  10   320 Company Che~ Company Checki~ Bill P~ 2018-10-05 3079      0   6000
  # ... with 26 more rows

999 numbers

  # A tibble: 11 x 8
        id account       subaccount       type  date       num   debit credit
     <dbl> <chr>         <chr>            <chr> <date>     <chr> <dbl>  <dbl>
   1   403 Company Chec~ Company Checkin~ Payc~ 2018-12-15 5365    0    340. 
   2   883 Inventory As~ Inventory Asset  Invo~ 2018-04-16 71087   0     49.0
   3   971 Inventory As~ Inventory Asset  Invo~ 2018-05-16 71095   0     98.0
   4  1029 Inventory As~ Inventory Asset  Invo~ 2018-05-24 71086   0     49.0
   5  1082 Inventory As~ Inventory Asset  Invo~ 2018-06-07 71085   0     49.0
   6  1495 Inventory As~ Inventory Asset  Invo~ 2018-12-10 71105   0     98.0
   7  4027 Purchases  (~ Purchases  (Cos~ Invo~ 2018-04-16 71087  49.0    0  
   8  4112 Purchases  (~ Purchases  (Cos~ Invo~ 2018-05-16 71095  98.0    0  
   9  4170 Purchases  (~ Purchases  (Cos~ Invo~ 2018-05-24 71086  49.0    0  
  10  4220 Purchases  (~ Purchases  (Cos~ Invo~ 2018-06-07 71085  49.0    0  
  11  4615 Purchases  (~ Purchases  (Cos~ Invo~ 2018-12-10 71105  98.0    0

JV posted on weekends

Top 10 JVs posted on weekends are;

  # A tibble: 10 x 3
     type            account                        n
     <chr>           <chr>                      <int>
   1 Invoice         Revenue                      165
   2 Invoice         Inventory Asset              164
   3 Invoice         Purchases  (Cost of Goods)   164
   4 Paycheck        Payroll Liabilities          153
   5 Paycheck        Wages                         76
   6 Deposit         Undeposited Funds             52
   7 Liability Check Payroll Liabilities           44
   8 Paycheck        Company Checking Account      19
   9 Invoice         Accounts Receivable           18
  10 Paycheck        Direct Labor                  17

Monthly accumulated FS

  # A tibble: 6 x 7
    subaccount                    Jan     Feb     Mar    Oct    Nov     Dec
    <chr>                       <dbl>   <dbl>   <dbl>  <dbl>  <dbl>   <dbl>
  1 Accounting Fees             225        0       0   2544      0       0 
  2 Accounts Payable         -35903   -45421  -21732. -4438. -6517. -16697.
  3 Accounts Receivable       22443.   10455.  16399. 35194. 16227.  32333.
  4 Accumulated Depreciation    -76.9   -154.   -231.  -769.  -846.   -923.
  5 Advertising Expense         500        0       0   2000      0       0 
  6 Business License & Fees       0      710.      0      0      0       0