Information Sources, Data Capture, and Ethics of Use
Learning objectives
By the end of this chapter you should be able to:
- Distinguish between primary and secondary data, and between internal and external information sources, in the context of management decisions.
- Assess information sources for reliability, timeliness, bias and total cost (including collection, processing and control).
- Build a practical data-capture plan that specifies what to capture, how often, who owns it, and what quality checks apply.
- Apply ethical and governance principles when collecting, storing, analysing and reporting information.
- Explain how data quality and ethical choices can affect reported figures and business decisions.
Overview and key concepts
Management decisions depend on information: pricing, costing, budgeting, performance measurement and investment appraisal all rely on data that is fit for purpose. “More data” is not automatically “better data”. The usefulness of information is shaped by:
- Source (where it comes from)
- Method (how it is captured and processed)
- Quality (how accurate, complete and comparable it is)
- Controls and ethics (whether it is handled lawfully, fairly and securely)
This chapter explains how to classify information sources, design a workable capture process, and apply governance and ethical safeguards—while keeping a clear line of sight to how weak information can flow through to costing outputs and, ultimately, to reported figures.
Classification of information sources
Information sources are commonly grouped in two ways:
- Internal vs external: generated inside the organisation or obtained from outside parties.
- Primary vs secondary: collected first-hand for the current purpose or reused from earlier collection for a different purpose.
These categories overlap. For example, an internally run staff time survey is internal and primary, while last year’s customer returns database is internal and secondary for a new quality-improvement project.
Primary and secondary data
Primary data
Primary data is gathered specifically to answer the current question. Because it is designed around the decision, it can be targeted and timely. Examples include:
- A short customer survey to identify why orders are cancelled
- Direct observation of machine downtime for a maintenance review
- A one-off study of delivery times by route for a logistics redesign
Strengths
- Can be tailored to the decision and definitions can be controlled (what counts as “delay”, “defect”, etc.)
- More likely to capture exactly what is missing from existing records
Limitations
- Typically more expensive and slower to obtain
- Can be distorted by poor sampling design or leading questions
Secondary data
Secondary data already exists, having been collected for another purpose. Examples include:
- Prior-period job costing files
- Published market statistics
- Supplier catalogues and industry price lists
- Historic warranty and returns information
Strengths
- Usually cheaper and quicker
- Useful for trend analysis and benchmarking
Limitations
- Definitions may not match the current decision (e.g. different product groupings)
- It may be out of date, incomplete, or shaped by the original purpose
Internal and external data
Internal data
Internal sources include records created through everyday operations, such as:
- Sales invoices, credit notes and order records
- Labour time records, payroll summaries and overtime logs
- Materials requisitions, inventory movements and scrap reports
- Machine usage data, maintenance logs and production reports
Internal information is often operationally relevant and can be linked to specific processes and accountability.
External data
External sources provide market context and independent reference points, for example:
- Competitor pricing and product features
- Supplier quotations and lead times
- Macroeconomic indicators (inflation, interest rates, exchange rates)
- Regulatory requirements, industry standards and professional guidance
External data can improve decision quality, but it must be checked carefully for comparability, credibility and the provider's motives (a trade body or competitor may present figures selectively).
Data governance and quality
Data governance
Data governance is the set of rules, responsibilities and controls that determine:
- Who owns the data (accountability for definitions and accuracy)
- Who can access it (need-to-know permissions)
- How it is stored (security, retention, backups)
- How changes are controlled (version control, approvals, traceability)
Strong governance supports consistent reporting and reduces the risk of error, manipulation and unauthorised use.
Judging whether data is “good enough”
Before using a dataset, decide what would make it unsafe for the decision. In practice, most problems fall into a small number of checks:
- Truth check: do the figures reflect what actually occurred in operations (not what people intended to record)?
- Coverage check: are any transactions, time periods, locations or products missing from the dataset?
- Definition check: are codes and labels applied the same way across teams and over time (so like is compared with like)?
- Deadline check: will the information arrive early enough to influence the decision, not just explain it afterwards?
- Rule check: do entries comply with basic input rules agreed in advance (for example, valid dates, sensible ranges, mandatory fields)?
- Duplication check: has anything been captured twice, merged incorrectly, or replicated across systems?
Set minimum acceptable standards in advance (for example, “no duplicate job numbers” or “timesheets submitted by 10am next day”), then test the data against that standard.
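The checks above can be expressed as simple rules run against the data before it is used. A minimal sketch in Python, where the field names and the 14-hour range limit are hypothetical standards of the kind agreed in advance:

```python
from datetime import date

# Hypothetical minimum standards agreed in advance.
MAX_DAILY_HOURS = 14                    # sensible-range rule for one timesheet line
MANDATORY = ("job_no", "task", "hours", "date")

def check_duplicates(rows):
    """Duplication check: flag (job number, date) pairs captured more than once."""
    seen, dupes = set(), []
    for r in rows:
        key = (r["job_no"], r["date"])
        if key in seen:
            dupes.append(key)
        seen.add(key)
    return dupes

def check_rules(row):
    """Rule check: mandatory fields present and hours within a sensible range."""
    errors = [f for f in MANDATORY if not row.get(f)]
    if not (0 < row.get("hours", 0) <= MAX_DAILY_HOURS):
        errors.append("hours out of range")
    return errors

rows = [
    {"job_no": "J101", "task": "A", "hours": 7.5, "date": date(2024, 5, 1)},
    {"job_no": "J101", "task": "B", "hours": 99,  "date": date(2024, 5, 1)},  # duplicate key, bad hours
]
print(check_duplicates(rows))   # the repeated (job, date) key
print(check_rules(rows[1]))     # ['hours out of range']
```

The point is not the code itself but that each check is written down, agreed, and testable, so "good enough" is a decision made before the data arrives rather than after.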
Ethics, confidentiality and lawful processing
Core ethical principles in practice
Information is valuable, but it can also be misused. A practical ethical framework for data capture and use can be anchored on five core principles:
- Integrity: records are complete and honest; time, quantities and costs are not deliberately misstated to protect targets or budgets.
- Objectivity: analysis is not designed to force a preferred outcome; assumptions and exclusions are stated, and adverse findings are not hidden.
- Professional competence and due care: data is collected using suitable methods, validated appropriately and interpreted with proper technical skill.
- Confidentiality: sensitive information is protected and shared only with authorised users for legitimate purposes.
- Professional behaviour: legal requirements, internal policies and contractual obligations are followed; data is not used in ways that would undermine trust.
These principles apply directly to common management accounting tasks such as setting costing rates, evaluating performance, and preparing profitability analyses.
Confidentiality
Confidentiality means restricting disclosure of sensitive information, such as personal data (customers and staff), commercially sensitive pricing, margins, supplier terms and non-public performance data. Loss of confidentiality can create legal exposure, penalties, reputational damage and lost competitive advantage.
Lawful processing, consent and data minimisation
Where personal information is processed, the organisation needs a proper basis for doing so and must be transparent about purpose. Consent is only one possible basis and it is not always appropriate—particularly in employment settings where power imbalance can undermine “freely given” consent. In many business contexts, other bases may be more suitable (for example, performance of a contract, legal obligations or legitimate organisational needs, depending on circumstances).
Data minimisation reduces risk: collect only what is needed for the stated purpose, restrict access, and retain it only as long as required.
Audit trail and information security
Audit trail
An audit trail is the evidence that shows how a figure was produced. It should allow a reviewer to trace:
- the original source
- key transformations (for example, allocations or exclusions)
- approvals and changes
- final outputs used in decisions or reports
A strong audit trail improves accountability and helps resolve disputes.
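One lightweight way to support traceability is an append-only log in which every transformation of a figure is recorded rather than overwritten. A sketch, with hypothetical identifiers and field names:

```python
import datetime

audit_log = []  # append-only: entries are added, never edited in place

def log_step(figure_id, source, action, user, detail=""):
    """Record one step in how a figure was produced."""
    audit_log.append({
        "figure_id": figure_id,
        "source": source,
        "action": action,   # e.g. "capture", "allocation", "exclusion", "approval"
        "user": user,
        "detail": detail,
        "timestamp": datetime.datetime.now().isoformat(timespec="seconds"),
    })

def trace(figure_id):
    """Reconstruct the chain of events behind a figure, oldest first."""
    return [e for e in audit_log if e["figure_id"] == figure_id]

log_step("job_cost_J101", "timesheets_may", "capture", "a.clark")
log_step("job_cost_J101", "overhead_model", "allocation", "a.clark", "rate £18/hr")
log_step("job_cost_J101", "review", "approval", "b.shah")
print([e["action"] for e in trace("job_cost_J101")])
```

A reviewer can then answer "where did this number come from?" by reading the chain of entries, which is exactly the property the audit trail requirements above describe.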
Information security
Information security protects data against loss, corruption and unauthorised access. Typical measures include access control, encryption, secure transfer, routine backups and tested recovery procedures, monitoring of unusual access or changes, and training to reduce human error.
Core theory and frameworks
A decision-led way to choose information
Start with the decision deadline, then work backwards:
First, write down what must be decided and what “success” looks like (for example, reduce under-billing without increasing disputes). Next, identify the few measures that genuinely drive that decision and define them tightly. Only then select sources—internal or external, new collection or re-use—based on which option can meet the deadline with acceptable reliability.
Build in failure-mode checks early: where could bias creep in, where could coding go wrong, and what would an independent reviewer challenge? Finally, price the full cost of the information (collection effort, processing time, controls, storage and access management) and record limitations so users do not over-interpret the outputs.
Mini-example (decision lens)
If a manager needs next month’s staffing plan, perfect data in six weeks is less useful than “good-enough” data by next Tuesday. The capture plan might prioritise rapid, consistent time coding and exception reporting over collecting every optional detail.
Designing a practical data-capture plan
A practical capture plan can be built around the data lifecycle: capture → validate → correct → approve → report → retain.
- Capture: specify required fields and definitions (e.g. job number, task code, hours, grade, date). Decide whether capture is automated, manual, or hybrid.
- Validate: build checks into the process (mandatory fields, sensible ranges, valid dates, duplicate detection, authorisation limits).
- Correct: define how errors are fixed, who can amend records, and what evidence is needed.
- Approve: specify who reviews exceptions and signs off corrections (segregation of duties where possible).
- Report: define outputs (job cost summaries, variance reports, utilisation KPIs) and who receives them.
- Retain: set retention periods, access rights and security measures, including backup and recovery.
Pilot testing is often cost-effective: it reveals practical issues before full rollout.
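The lifecycle stages can be sketched as a small pipeline in which a record must pass validation before it can be approved. The function and field names below are illustrative, not a prescribed design:

```python
def capture(raw):
    """Capture: normalise a raw entry into the agreed required fields."""
    return {"job_no": raw.get("job"), "hours": raw.get("hrs"), "approved": False}

def validate(record):
    """Validate: return a list of exceptions (empty means clean)."""
    issues = []
    if not record["job_no"]:
        issues.append("missing job number")
    if record["hours"] is None or not (0 < record["hours"] <= 14):
        issues.append("hours outside sensible range")
    return issues

def approve(record, reviewer):
    """Approve: only clean records are signed off; exceptions stay unapproved
    until corrected (supporting segregation of duties)."""
    if validate(record):
        return record
    record["approved"] = True
    record["reviewer"] = reviewer
    return record

rec = approve(capture({"job": "J204", "hrs": 6.0}), reviewer="b.shah")
print(rec["approved"])   # True: the record passed validation and was signed off
```

Structuring capture this way makes the "correct" stage explicit: a failed record carries its list of issues back to whoever is authorised to amend it.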
Bias and misclassification
Bias can enter through sampling, measurement, classification or interpretation. It is managed through clear definitions, balanced sampling (where relevant), independent review and transparent disclosure of assumptions.
Misclassification is a frequent operational failure, for example:
- labour hours recorded to the wrong job
- materials issued to a default code rather than the actual product
- supplier invoices mis-coded between direct and indirect costs
- customer receipts posted to the wrong account
These issues can look minor individually but become significant when aggregated, distorting product margins and decision-making.
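The aggregation effect can be illustrated with a hypothetical two-job example in which 20 labour hours belonging to Job A are posted to Job B. Total cost is unchanged, so the error is invisible at the aggregate level, yet both job margins are wrong:

```python
RATE = 18  # £ per labour hour (illustrative)

true_hours     = {"Job A": 120, "Job B": 80}
recorded_hours = {"Job A": 100, "Job B": 100}   # 20 hours misposted from A to B
revenue        = {"Job A": 3000, "Job B": 2400}

def margin(hours):
    """Job margin = revenue less labour cost at the standard rate."""
    return {job: revenue[job] - hours[job] * RATE for job in revenue}

print(margin(true_hours))       # {'Job A': 840, 'Job B': 960}
print(margin(recorded_hours))   # {'Job A': 1200, 'Job B': 600}
# Total labour cost is £3,600 either way, so the totals reconcile perfectly
# while the recorded figures overstate Job A's margin and understate Job B's.
```

A reconciliation of totals would therefore not detect this error; only job-level review or input controls would.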
Cost–benefit analysis of data improvements
Improving data capture often has both cash costs (subscriptions, devices) and time costs (admin hours, training, review time). Benefits may include:
- reduced revenue leakage (more recoverable billing)
- fewer disputes and less rework
- better pricing and quotation accuracy
- improved capacity planning and scheduling
- stronger control and reduced error rates
Only incremental costs and benefits should be included: sunk costs already incurred, and overheads that will not change, are irrelevant to the decision.
Integrating data into management accounting outputs and reported figures
Data flows through the organisation in stages:
- Raw capture (time, materials, sales orders, receipts)
- Management accounting outputs (job costs, product profitability, budgets, KPIs, performance reports)
- Operational decisions (pricing, staffing, outsourcing, process improvement)
- Reported figures (where management information feeds accounting estimates, classifications or cut-off assessments)
When capture is weak, the immediate damage is often to costing accuracy and KPIs; the wider risk is that poor records also undermine estimates and classifications used in reporting (for example, receivable impairment estimates or contract balances).
Information quality and reported figures
Weak information can lead to errors in recognition, measurement and classification. Common routes include:
- Cash vs credit transactions: revenue is recognised based on when goods or services have been provided, not simply when cash is collected—so weak customer/order data can distort receivables, credit control and forecasts.
- Operating expenses vs inventory/cost of sales: misclassifying costs can inflate inventory or understate expenses, distorting profit and margins.
- Deferred income and contract liability: cash received in advance may represent an obligation to provide goods or services later; if contract data is incomplete, income may be recorded too early or liabilities omitted.
- Borrowings (loans/loan notes) and interest: incomplete loan records can lead to missing interest accruals or incorrect classification between current and non-current portions.
- Loss allowance for trade receivables (impairment/ECL estimate): poor customer history, ageing data and credit control records can lead to unrealistic impairment estimates and volatile results.
- Equity transactions: incorrect coding of share issues, dividends or owner withdrawals can distort equity presentation and retained results.
At a high level, these problems ultimately affect the balance between assets, liabilities and equity, and undermine confidence in performance measures.
Worked example
Narrative scenario
A service business is considering a new time-recording process to improve job costing and reduce under-billing. The current process results in estimated under-billing of 1.8% of invoice value, and staff spend 35 hours per month resolving billing disputes.
The proposed process is expected to reduce under-billing to 0.6% and reduce dispute-handling time to 15 hours per month. However, it will require an additional 12 hours per month of administration time and a software subscription.
The business completes 240 jobs per month, with an average invoice value of £420. Staff time costs £18 per hour. The software subscription costs £160 per month.
The decision is whether the financial benefits of improved data capture exceed the additional running costs.
Required
- Compute the current monthly under-billing amount.
- Compute the monthly under-billing amount after the change.
- Calculate the monthly savings from reduced dispute-handling labour time.
- Determine the additional monthly running costs of the new process.
- Calculate the net monthly benefit of the improved data capture.
Solution
1) Current monthly under-billing amount
Monthly invoiced revenue:
240 × £420 = £100,800
Under-billing at 1.8%:
£100,800 × 0.018 = £1,814.40
Current under-billing amount = £1,814.40 per month
2) Monthly under-billing amount after the change
Under-billing at 0.6%:
£100,800 × 0.006 = £604.80
Reduction in under-billing:
£1,814.40 − £604.80 = £1,209.60
Benefit from reduced under-billing = £1,209.60 per month
Interpretation note: This benefit is realised only if the business can validly bill and collect the additional amounts. If improved capture causes extra billable time to be challenged or written off, the realised benefit will be lower.
3) Savings from reduced dispute-handling labour time
Current dispute-handling cost:
35 × £18 = £630
New dispute-handling cost:
15 × £18 = £270
Savings:
£630 − £270 = £360
Dispute-handling savings = £360 per month
4) Additional monthly running costs
Additional admin time:
12 × £18 = £216
Software subscription: £160
Total additional running costs:
£216 + £160 = £376
Additional running costs = £376 per month
5) Net monthly benefit
Net benefit = (under-billing reduction + dispute-handling savings) − additional running costs
= (£1,209.60 + £360) − £376
= £1,569.60 − £376
= £1,193.60 per month
Net monthly benefit = £1,193.60
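The five steps above can be reproduced in a few lines; a sketch in Python using only the figures given in the scenario:

```python
# Scenario inputs
jobs_per_month = 240
avg_invoice = 420.0      # £ per job
labour_rate = 18.0       # £ per hour
subscription = 160.0     # £ per month

# Step 1: current under-billing
revenue = jobs_per_month * avg_invoice          # £100,800 invoiced per month
under_billing_now = revenue * 0.018             # £1,814.40

# Step 2: under-billing after the change, and the resulting benefit
under_billing_new = revenue * 0.006             # £604.80
billing_benefit = under_billing_now - under_billing_new   # £1,209.60

# Step 3: dispute-handling savings (35 hours falls to 15 hours)
dispute_saving = (35 - 15) * labour_rate        # £360

# Step 4: additional running costs (12 admin hours plus the subscription)
extra_costs = 12 * labour_rate + subscription   # £376

# Step 5: net monthly benefit
net_benefit = billing_benefit + dispute_saving - extra_costs
print(f"Net monthly benefit: £{net_benefit:,.2f}")   # £1,193.60
```

Laying the calculation out this way also makes sensitivity testing easy: changing the 0.6% assumption or the collection rate immediately shows how robust the net benefit is.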
Common pitfalls and misunderstandings
- Treating volume as proof of accuracy: large datasets can still be systematically wrong or biased.
- Reusing secondary data without checking definitions: what counted as a “job” or “billable hour” last year may not match this year’s decision.
- Omitting incremental costs: subscriptions, admin time, validation/review time and exception handling must be included.
- Assuming revenue improvement is automatic: under-billing reduction benefits profit only to the extent that additional billing is valid and collectible.
- Overconfidence in automation: automated capture needs validation rules and periodic review.
- Weak traceability: if corrections are not logged and authorised, outputs quickly lose credibility.
- Ignoring purpose limits and confidentiality: data collected for one purpose should not be casually repurposed, and sensitive data must be protected.
- Misclassification: wrong coding can distort product or client profitability even when total revenue appears unchanged.
Summary
Reliable decisions depend on information that is fit for purpose. Primary data can be targeted but costly; secondary data is quick but may be misaligned. Internal sources provide operational detail; external sources add context but require careful evaluation.
A workable data-capture plan focuses on the lifecycle (capture, validate, correct, approve, report and retain), with clear definitions, ownership, validation rules, traceability and security. Ethical practice—grounded in integrity, objectivity, competence and due care, confidentiality and professional behaviour—supports trust and reduces risk.
Weak data first damages management accounting outputs such as job costs and KPIs, and can also undermine reported figures through poor classification and unreliable estimates (for example, contract liabilities, borrowings and interest, and loss allowances on receivables). The worked example illustrates how improved capture can deliver a measurable net benefit when incremental costs and realistic assumptions are applied.
Glossary
Primary data
Information gathered specifically for the current decision, using definitions and methods designed for that purpose (e.g. a targeted survey or direct observation).
Secondary data
Existing information reused for a new purpose (e.g. historic records, published statistics, or prior internal reports).
Internal data
Information generated within the organisation’s processes (e.g. time records, invoices, production logs, inventory movements).
External data
Information obtained from outside the organisation (e.g. competitor prices, supplier quotations, economic indicators, regulatory guidance).
Data governance
The ownership, rules and controls that determine how data is defined, accessed, stored, changed and reviewed.
Bias
A systematic distortion that makes conclusions unreliable, arising from sampling, measurement, classification or interpretation.
Confidentiality
Protecting sensitive information from unauthorised access or disclosure, including personal data and commercially sensitive details.
Consent
One possible lawful basis for using personal information in some situations. It is not always appropriate (particularly where consent may not be freely given), and organisations must be clear about the basis used and the purpose of processing.
Data minimisation
Collecting only what is needed for the stated purpose, reducing security exposure and lowering the risk of misuse.
Audit trail
A traceable record that shows where data came from, what was done to it, who approved changes and how outputs were produced.
Information security
Practical protections that reduce the risk of data loss, corruption or unauthorised access (e.g. permissions, encryption, backups and monitoring).
Contract liability
A balance that can arise when consideration is received (or becomes due) before goods or services are provided, reflecting an obligation to deliver in the future.
Loss allowance (trade receivables)
An impairment estimate for expected non-collection from receivables, based on credit risk information, ageing and experience.
Written by
AccountingBody Editorial Team